Progress in Computer Science and Applied Logic 




Coding, Cryptography 
and Combinatorics 

Keqin Feng 
Harald Niederreiter 
Chaoping Xing 
Editors 



Springer Basel AG 




Progress in Computer Science and Applied Logic 

Volume 23 

Editor 

John C. Cherniavsky, National Science Foundation 



Associate Editors 

Robert Constable, Cornell University 
Jean Gallier, University of Pennsylvania 
Richard Platek, Cornell University 
Richard Statman, Carnegie-Mellon University 








Editors: 



Keqin Feng 

Department of Mathematical Sciences 
Tsinghua University 
Beijing 100084 
China 

kqfeng@math.hkbu.edu.hk 
kfeng@math.tsinghua.edu.cn 

Chaoping Xing 
Department of Mathematics 
National University of Singapore 
2 Science Drive 2 
Singapore 117543 
Republic of Singapore 
matxcp@nus.edu.sg 



Harald Niederreiter 
Department of Mathematics 
National University of Singapore 
2 Science Drive 2 
Singapore 117543 
Republic of Singapore 
nied@math.nus.edu.sg 



2000 Mathematics Subject Classification 11G18, 11G20, 11T71, 94A55, 94A60, 94A62, 
94B05, 94B35, 94B60, 94B65 



A CIP catalogue record for this book is available from the Library of Congress, 
Washington D.C., USA 

Bibliographic information published by Die Deutsche Bibliothek 

Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; 

detailed bibliographic data is available in the Internet at <http://dnb.ddb.de>. 

ISBN 978-3-0348-9602-3 

This work is subject to copyright. All rights are reserved, whether the whole or part of 
the material is concerned, specifically the rights of translation, reprinting, re-use of 
illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and 
storage in data banks. For any kind of use permission of the copyright owner must be 
obtained. 

© 2004 Springer Basel AG 

Originally published by Birkhäuser Verlag, Basel in 2004 
Softcover reprint of the hardcover 1st edition 2004 

Printed on acid-free paper produced of chlorine-free pulp. TCF 

ISBN 978-3-0348-9602-3 ISBN 978-3-0348-7865-4 (eBook) 

DOI 10.1007/978-3-0348-7865-4 



987654321 



www.birkhauser-science.com 




Table of Contents 



Preface vii 

Invited Papers 

Claude Carlet 

On the Secondary Constructions of Resilient and Bent Functions 3 

Tadao Kasami 

Adaptive Recursive MLD Algorithm Based on Parallel 

Concatenation Decomposition for Binary Linear Codes 29 

Wen-Ching Winnie Li 

Modularity of Asymptotically Optimal Towers of Function Fields 51 

Peizhong Lu and Lianzhen Huang 

A New Correlation Attack on LFSR Sequences 

with High Error Tolerance 67 

Amin Shokrollahi 

LDPC Codes: An Introduction 85 

Contributed Papers 

Jintai Ding and Dieter Schmidt 

The New Implementation Schemes of 

the TTM Cryptosystem Are Not Secure 113 

Elona Erez and Meir Feder 

The Capacity Region of Broadcast Networks with Two Receivers 129 

Fang-Wei Fu, San Ling and Chaoping Xing 

Constructions of Nonbinary Codes Correcting t-Symmetric Errors and 

Detecting All Unidirectional Errors: Magnitude Error Criterion 139 

Aline Gouget 

On the Propagation Criterion of Boolean Functions 153 

Tor Helleseth, Jyrki Lahtonen and Petri Rosendahl 
On Certain Equations over Finite Fields and 

Cross-Correlations of m-Sequences 169 

Françoise Levy-dit-Vehel and Ludovic Perret 

A Polly Cracker System Based on Satisfiability 177 







Jing Li 

Combinatorially Designed LDPC Codes Using Zech Logarithms 

and Congruential Sequences 193 

Lei Li and Shoulun Long 

New Constructions of Constant-Weight Codes 209 

San Ling and Patrick Solé 

Good Self-Dual Quasi-Cyclic Codes over F_q, q Odd 223 

Wilfried Meidl 

Linear Complexity and k-Error Linear Complexity 

for p^n-Periodic Sequences 227 

Jean-Francis Michon, Pierre Valarcher and Jean-Baptiste Yunes 

HFE and BDDs: A Practical Attempt at Cryptanalysis 237 

Harald Niederreiter 

Digital Nets and Coding Theory 247 

Harald Niederreiter and Ferruh Özbudak 

Constructive Asymptotic Codes with an Improvement 

on the Tsfasman-Vlăduţ-Zink and Xing Bounds 259 

Josef Pieprzyk and Huaxiong Wang 

Malleability Attacks on Multi-Party Key Agreement Protocols 277 

Charles C. Pinter 

Combinatorial Tableaux in Isoperimetry 289 

Yuansheng Tang 

On the Error Exponents of Reliability-Order-Based 

Decoding Algorithms for Linear Block Codes 303 

Xiaojian Tian and Cunsheng Ding 

A Construction of Authentication Codes with Secrecy 319 

Hitoshi Tokushige, Jun Asatani, Marc P.C. Fossorier and Tadao Kasami 
Selection Method of Test Patterns in Soft-Input and 

Output Iterative Bounded Distance Decoding Algorithm 331 

Yejing Wang, Luke McAven and Reihaneh Safavi-Naini 

Deletion Correcting Using Generalised Reed-Solomon Codes 345 

Arne Winterhof 

A Note on the Linear Complexity Profile of 

the Discrete Logarithm in Finite Fields 359 

Huapeng Wu, M. Anwar Hasan and Ian F. Blake 
Speeding Up RSA and Elliptic Curve Systems 

by Choosing Suitable Moduli 369 

Gang Yao, Feng Bao and Robert H. Deng 

Security Analysis of Three Oblivious Transfer Protocols 385 

Gang Yao, Guilin Wang and Yong Wang 

An Improved Identification Scheme 397 




Preface 



It has long been recognized that there are fascinating connections between cod- 
ing theory, cryptology, and combinatorics. Therefore it seemed desirable to us to 
organize a conference that brings together experts from these three areas for a 
fruitful exchange of ideas. We decided on a venue in the Huang Shan (Yellow 
Mountain) region, one of the most scenic areas of China, so as to provide the 
additional inducement of an attractive location. The conference was planned for 
June 2003 with the official title Workshop on Coding, Cryptography and Combi- 
natorics (CCC 2003). Those who are familiar with events in East Asia in the first 
half of 2003 can guess what happened in the end, namely the conference had to 
be cancelled in the interest of the health of the participants. The SARS epidemic 
posed too serious a threat. 

At the time of the cancellation, the organization of the conference was at 
an advanced stage: all invited speakers had been selected and all abstracts of 
contributed talks had been screened by the program committee. Thus, it was de- 
cided to call on all invited speakers and presenters of accepted contributed talks 
to submit their manuscripts for publication in the present volume. Altogether, 39 
submissions were received and subjected to another round of refereeing. After care- 
ful scrutiny, 28 papers were accepted for publication. The selected papers cover a 
wide range of topics from coding theory, cryptology, and combinatorics and they 
contain significant advances in these areas as well as very useful surveys. 

We extend our cordial thanks to the international program committee con- 
sisting of A.R. Calderbank (USA), C. Carlet (France), C.S. Ding (Hong Kong), 
K.Q. Feng (China, co-chair), T. Helleseth (Norway), H. Imai (Japan), 
D. Jungnickel (Germany), A. Klapper (USA), P.V. Kumar (USA), S. Ling (Singapore), 
S.L. Ma (Singapore), J.L. Massey (Denmark/Sweden), H. Niederreiter (Singapore, 
co-chair), T. Okamoto (Japan), D.Y. Pei (China), J.Y. Shao (China), Z.X. Wan 
(China), and G.Z. Xiao (China). We are grateful to all referees of the manuscripts 
for their conscientious work and their valuable advice. We acknowledge with grati- 
tude the organizational and financial support that was provided by the University 
of Science and Technology of China in Hefei and the National Science Foundation 
of China. Special thanks go to Shoulun Long of USTC in Hefei. 





Finally, we express our thanks to Birkhäuser Verlag, and especially to Dr. 
Thomas Hempfling, for agreeing to publish this volume and for the advice and 
help we have received. 

January 2004 Keqin Feng 

Harald Niederreiter 
Chaoping Xing 




Invited Papers 




Progress in Computer Science and Applied Logic, Vol. 23, 3-28 
© 2004 Birkhäuser Verlag Basel/Switzerland 



On the Secondary Constructions 
of Resilient and Bent Functions 

Claude Carlet 



Abstract. We first survey the known secondary constructions of Boolean 
functions that yield resilient functions achieving the best possible 
trade-offs between resiliency order, algebraic degree and nonlinearity 
(that is, achieving Siegenthaler’s bound and Sarkar et al.’s bound). We 
then introduce and study a general secondary construction of Boolean 
functions, which includes the known secondary constructions recalled 
before as particular cases. We apply this construction to design more 
numerous functions achieving optimum trade-offs between the three 
characteristics (and additionally having no linear structure). We 
conclude the paper by indicating generalizations of our construction to 
Boolean and vectorial functions, and by relating it to a known secondary 
construction of bent functions. 

Keywords. Stream ciphers, Boolean function, correlation immunity, resiliency, 
nonlinearity, algebraic degree. 



1. Introduction 

Boolean functions are extensively used in stream cipher systems. Important 
necessary properties of Boolean functions used in these systems are 
balancedness, high-order correlation immunity, high algebraic degree and high 
nonlinearity. An n-variable Boolean function $f : F_2^n \to F_2$ is called 
balanced if its output is uniformly distributed over {0,1}. It is called mth 
order correlation immune if the probability distribution of its output is 
unaltered when any m of its input bits are kept constant. A balanced mth order 
correlation immune function is called m-resilient. The algebraic degree of an 
n-variable Boolean function equals the degree of its algebraic normal form (see 
Section 2), and its nonlinearity equals its Hamming distance to the set of all 
n-variable affine functions. 

Bounds exist that limit the values these characteristics can take for any 
Boolean function: 

- Siegenthaler showed in [29] that any n-variable mth order correlation immune 
function ($0 \le m < n$) has algebraic degree smaller than or equal to 
$n - m$, and that any n-variable m-resilient function ($0 \le m < n$) has 
algebraic degree smaller than or equal to $n - m - 1$ if $m < n - 1$ and equal 
to 1 if $m = n - 1$. 



- Sarkar and Maitra showed in [28] a divisibility bound on the Walsh transform 
values of an n-variable, mth order correlation immune (resp. m-resilient) 
function, with $m \le n - 2$: these values are divisible by $2^{m+1}$ (resp. by 
$2^{m+2}$). This provided a nontrivial upper bound on the nonlinearity of 
resilient functions (and also of correlation immune functions, but non-balanced 
functions are of less cryptographic interest), independently obtained by 
Tarannikov [31] and by Zheng and Zhang [35]: the nonlinearity of any 
n-variable, m-resilient function is upper bounded by $2^{n-1} - 2^{m+1}$. 
Tarannikov showed that resilient functions achieving this bound must have 
degree $n - m - 1$ (that is, achieve Siegenthaler’s bound); thus, they achieve 
the best possible trade-offs between resiliency order, degree and nonlinearity. 
Moreover, they must be plateaued (see Section 2), which gives them a better 
chance of resisting the attack of Leveiller et al. [17]. For 
$m \le \frac{n}{2} - 2$, the upper bound $2^{n-1} - 2^{m+1}$ on the 
nonlinearity of m-resilient functions cannot be tight, since we know that the 
nonlinearity of any balanced n-variable function is strictly upper bounded by 
$2^{n-1} - 2^{n/2-1}$. But thanks to the divisibility property due to Sarkar 
and Maitra, a better upper bound then exists: if n is even and if 
$m \le \frac{n}{2} - 2$, then the nonlinearity of any m-resilient function is 
upper bounded by $2^{n-1} - 2^{n/2-1} - 2^{m+1}$. If n is odd, the bound is 
more complex (see [28]), but a potentially better upper bound can be given, 
whatever the parity of n: Sarkar-Maitra’s divisibility bound shows that 
$\widehat{f}(a) = \varphi(a) \cdot 2^{m+2}$, where $\varphi(a)$ is 
integer-valued. But Parseval’s relation 
$\sum_{a \in F_2^n} \widehat{f}^{\,2}(a) = 2^{2n}$ and the fact that 
$\widehat{f}(a)$ is null for every word a of weight $\le m$ imply 

$$\sum_{a;\; w_H(a) > m} \varphi^2(a) = 2^{2n-2m-4}$$ 

and thus 

$$\max_{a \in F_2^n} |\varphi(a)| \ge \sqrt{\frac{2^{2n-2m-4}}{2^n - \sum_{i=0}^{m} \binom{n}{i}}} = \frac{2^{n-m-2}}{\sqrt{2^n - \sum_{i=0}^{m} \binom{n}{i}}}.$$ 

Thus we have 

$$\max_{a \in F_2^n} |\varphi(a)| \ge \left\lceil \frac{2^{n-m-2}}{\sqrt{2^n - \sum_{i=0}^{m} \binom{n}{i}}} \right\rceil$$ 

(where $\lceil \lambda \rceil$ denotes the smallest integer greater than or 
equal to $\lambda$) and this implies that the nonlinearity of f is upper 
bounded by 

$$2^{n-1} - 2^{m+1} \left\lceil \frac{2^{n-m-2}}{\sqrt{2^n - \sum_{i=0}^{m} \binom{n}{i}}} \right\rceil.$$ 

We shall call Sarkar et al.’s bound the collection of all these upper bounds on 
the nonlinearities of m-resilient functions. 
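
The bounds recalled above are easy to evaluate numerically. The following Python sketch is only an illustration: the function names are ours, the more complex odd-n bound of [28] is not included, and the ceiling is computed in integer arithmetic to avoid floating-point rounding.

```python
from math import comb, isqrt

def ceil_ratio_to_sqrt(num, radicand):
    """Smallest integer k with k >= num / sqrt(radicand), in exact integer arithmetic."""
    # k >= num / sqrt(radicand) is equivalent to k*k*radicand >= num*num.
    k = isqrt(num * num // radicand)
    if k * k * radicand < num * num:
        k += 1
    return k

def sarkar_et_al_bound(n, m):
    """Smallest of the upper bounds recalled above for an n-variable m-resilient function."""
    if not 0 <= m <= n - 2:
        raise ValueError("expected 0 <= m <= n - 2")
    # Bound 2^(n-1) - 2^(m+1) coming from the divisibility by 2^(m+2).
    bounds = [2 ** (n - 1) - 2 ** (m + 1)]
    # Bound obtained from Parseval's relation and the vanishing of low-weight Walsh values.
    radicand = 2 ** n - sum(comb(n, i) for i in range(m + 1))
    bounds.append(2 ** (n - 1)
                  - 2 ** (m + 1) * ceil_ratio_to_sqrt(2 ** (n - m - 2), radicand))
    # Bound for n even and m <= n/2 - 2.
    if n % 2 == 0 and m <= n // 2 - 2:
        bounds.append(2 ** (n - 1) - 2 ** (n // 2 - 1) - 2 ** (m + 1))
    return min(bounds)
```

For instance, `sarkar_et_al_bound(8, 1)` evaluates to 116 and `sarkar_et_al_bound(5, 1)` to 12.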





More recently, it has been shown in [5] that the Walsh transform values of 
n-variable, m-resilient, degree d functions are divisible by 
$2^{m+2+\lfloor (n-m-2)/d \rfloor}$. This provided more precise upper bounds 
on the nonlinearities of resilient functions (see also a further improvement 
in [8]). 

Constructions of Boolean functions possessing a good combination of all 
characteristics have been proposed in [6, 7, 19, 20, 24, 27, 28, 31, 32]. But 
knowledge in this matter is still insufficient. For given n - and even for low 
values of n - we know very few (if any) n-variable Boolean functions achieving 
Siegenthaler’s and Sarkar et al.’s bounds. Knowing many such functions improves 
the chances of finding, among them, functions satisfying the additional 
constraints needed in applications. Obviously, the known “good” Boolean 
functions on n variables will always be an insignificant proportion of the 
total number of functions; but since the number of n-variable Boolean functions 
is huge, the number of known good functions can still become large enough for 
better use in applications. 

Functions achieving good characteristics can be obtained by using two kinds 
of constructions: the primary ones [1, 4, 6, 7], which directly give functions 
whose characteristics can be calculated (or at least lower bounded); and the 
secondary constructions [19, 20, 24, 27, 28, 31, 32], which build n-variable 
resilient functions from n'-variable ones (with n' < n in general). The primary 
constructions could seem preferable, since they lead to potentially more 
numerous functions. Unfortunately, the known primary constructions do not 
allow us (except in extreme cases, which are of no real cryptographic 
interest, see [6]) to build, on their own, resilient functions in any number of 
variables achieving Siegenthaler’s and Sarkar et al.’s bounds. They have been 
used, however, to design optimum functions in small numbers of variables, and 
they could also be modified (see, e.g., [27, 28]) to lead to functions in 
larger numbers of variables achieving good characteristics. But this often 
required computer help and is hard to generalize. Fortunately, secondary 
constructions have been used successfully to design optimum functions. 
Elementary constructions have been combined into a nice construction, 
introduced by Tarannikov [31] and later studied and slightly modified by 
Pasalic, Maitra, Johansson and Sarkar [24], which makes it possible to build an 
infinite sequence of such functions, cf. [31, 24, 19, 28, 32]. 

As we wrote above, these constructions lead, for given n, to very few n- 
variable functions achieving the bounds. In order to produce, for every n, more 
numerous such n-variable functions, we have either to find better primary con- 
structions (and this is an open problem), or to find more functions obtained from 
secondary constructions. The aim of the present paper is to give a general sec- 
ondary construction, including as particular cases the constructions cited above, 
and leading to many more functions achieving the bounds. 

The paper is organized as follows. In Section 2, we recall the necessary 
background on Boolean functions. In Sections 3 and 4, we give a survey of the 
known elementary constructions, including the efficient combination of 
elementary constructions introduced by Tarannikov and modified by Pasalic et 
al. (we call it Tarannikov et al.’s construction). In Section 5, we give a 
generalization of Tarannikov et al.’s construction, which explains why this 
construction works so well; our generalized construction is simpler to 
understand, thanks to a nice symmetry property, and leads to a multiple 
branching infinite tree of functions, whereas Tarannikov et al.’s construction 
leads only to an infinite sequence. Like Tarannikov et al.’s construction, our 
general construction uses pairs of functions achieving Siegenthaler’s and 
Sarkar et al.’s bounds, and whose Walsh spectra are disjoint. In Section 6, we 
study in detail the problem of generating such pairs. We give a complete 
description of these pairs for high resiliency orders ($m \ge n - 3$). For the 
remaining resiliency orders, we give, for every d, examples of such pairs of 
degree d, for all but finitely many cases. In Section 7, we indicate 
generalizations of our construction to Boolean and vectorial functions. In 
Section 8, we study how our construction can be used to design bent functions, 
and we relate it to a previous construction of bent functions. 



2. Preliminaries 

In this section we introduce a few basic concepts and results. By $F_2$ we 
denote the finite field GF(2). The (Hamming) distance $d(f,g)$ between two 
Boolean functions f and g on $F_2^n$ (two n-variable Boolean functions) equals 
the size of the set $\{x \in F_2^n \mid f(x) \ne g(x)\}$. The (Hamming) weight 
of f is its distance to the null function, that is, the size of its support 
$\{x \in F_2^n \mid f(x) = 1\}$. It is denoted by $wt(f)$. An n-variable 
Boolean function f is balanced if $wt(f) = 2^{n-1}$. The function f can be 
represented uniquely by a multivariate polynomial over $F_2$ of the form 
$f(x) = \sum_{I \subseteq \{1,\dots,n\}} a_I \prod_{i \in I} x_i$, called its 
algebraic normal form. The degree of this polynomial is called the algebraic 
degree or simply degree of f, and it is denoted by $d^\circ f$. The degree of 
any cryptographic function must be high (see [2, 14, 15, 16, 34]). 
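
The coefficients $a_I$ of the algebraic normal form can be computed from the truth table by the binary Möbius transform. The sketch below is an illustration under our own conventions, not taken from the paper: a function is stored as a list of $2^n$ bits, bit i-1 of the list index holds $x_i$, and the degree of the null function is reported as 0.

```python
def anf_coefficients(truth_table):
    """Binary Moebius transform; entry I (a bitmask) is the coefficient a_I."""
    coeffs = list(truth_table)
    n = len(coeffs).bit_length() - 1
    for i in range(n):
        step = 1 << i
        for x in range(len(coeffs)):
            if x & step:
                # Fold in the entry whose i-th variable is 0.
                coeffs[x] ^= coeffs[x ^ step]
    return coeffs

def algebraic_degree(truth_table):
    """Largest size of a set I with a_I = 1 (0 for the null function, by convention here)."""
    coeffs = anf_coefficients(truth_table)
    return max((bin(I).count("1") for I, c in enumerate(coeffs) if c), default=0)

# Example: f(x1, x2, x3) = x1 x2 + x3.
f = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(8)]
```

Here `anf_coefficients(f)` is nonzero exactly at the bitmasks 0b011 (the monomial x1 x2) and 0b100 (the monomial x3).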

The functions of degree at most one are called affine functions. The set of 
all n-variable affine functions is denoted by A(n). Affine functions have 
constant derivatives $D_a f(x) = f(x) + f(x + a)$. On the contrary, 
cryptographic functions should preferably have no constant derivative $D_a f$ 
($a \ne 0$), that is, no nonzero linear structure. This is a strong requirement 
in block ciphers (see [13]); in the case of stream ciphers, the existence of 
nonzero linear structures for a combining function or a filtering one, in 
pseudorandom generators, is a potential weakness (even if no attack using it 
has been found so far) that should preferably be avoided. 

The nonlinearity $N_f$ of an n-variable function f is defined as 

$$N_f = \min_{g \in A(n)} d(f,g),$$ 

i.e., $N_f$ is the distance between f and the set of all n-variable affine 
functions. It must be high (cf. [2, 21, 34]). An important tool for the 
analysis of Boolean functions is the Walsh transform, which we define next. The 
Walsh transform of an n-variable function $f(x_1,\dots,x_n)$ is the real-valued 
function over $F_2^n$ whose value at every $a \in F_2^n$ is defined as 

$$\widehat{f}(a) = \sum_{x \in F_2^n} (-1)^{f(x) + a \cdot x},$$ 

where $a \cdot x = a_1 x_1 + \cdots + a_n x_n$ is the usual inner product in 
$F_2^n$. We have 

$$N_f = 2^{n-1} - \frac{1}{2} \max_{a \in F_2^n} |\widehat{f}(a)|. \qquad (2.1)$$ 
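
Relation (2.1) translates directly into a computation. The following Python sketch, with names and truth-table conventions of our own choosing (bit i-1 of the index holds $x_i$), computes the Walsh transform by the usual fast butterfly and then the nonlinearity.

```python
def walsh_transform(truth_table):
    """Return the list of sum_x (-1)^(f(x) + a.x), indexed by a."""
    spectrum = [1 - 2 * bit for bit in truth_table]  # (-1)^f(x)
    size = len(spectrum)
    half = 1
    while half < size:
        for start in range(0, size, 2 * half):
            for j in range(start, start + half):
                u, v = spectrum[j], spectrum[j + half]
                spectrum[j], spectrum[j + half] = u + v, u - v
        half *= 2
    return spectrum

def nonlinearity(truth_table):
    """Relation (2.1): N_f = 2^(n-1) - max |Walsh value| / 2."""
    peak = max(abs(w) for w in walsh_transform(truth_table))
    return len(truth_table) // 2 - peak // 2

# Example: f(x1, ..., x4) = x1 x2 + x3 x4.
f = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1)) for x in range(16)]
```

For this example every Walsh value has absolute value 4, so the nonlinearity is 8 - 2 = 6.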

The nonlinearity of any Boolean function is upper bounded by 
$2^{n-1} - 2^{n/2-1}$ (we shall call this bound the universal bound), due to 
Parseval’s relation $\sum_{a \in F_2^n} \widehat{f}^{\,2}(a) = 2^{2n}$. An 
n-variable function f is called bent if it achieves nonlinearity 
$2^{n-1} - 2^{n/2-1}$ (this is possible only if n is even), which is equivalent 
to the fact that $\widehat{f}(u) = \pm 2^{n/2}$ for all $u \in F_2^n$. These 
functions (whose degrees can be at most equal to $n/2$, see [25]) present the 
best possible theoretical resistance to linear attacks, and at the same time to 
differential attacks, since a Boolean function f is bent if and only if (see 
[25]) all of its derivatives $D_a f$, $a \in F_2^n \setminus \{0\}$, are 
balanced (we then say that f satisfies the propagation criterion of degree n, 
PC(n); more generally, f satisfies $PC(\ell)$ for some integer $\ell$ if 
$D_a f$ is balanced for every nonzero vector a of Hamming weight at most 
$\ell$). But bent functions are never balanced, which makes them inappropriate 
for cryptographic use. The class of bent functions is included in the class of 
plateaued functions, whose Walsh transform values all belong to a set of the 
form $\{0, \lambda, -\lambda\}$ for some positive value of $\lambda$, called 
the amplitude of the function. 

Correlation immune functions were introduced by Siegenthaler [29, 30], to 
withstand a class of divide-and-conquer attacks on certain models of stream 
ciphers: we recall that a function $f(x_1,\dots,x_n)$ is mth order correlation 
immune if the probability distribution of its output is unaltered when any m of 
its inputs are kept constant. Xiao and Massey [33] provided a spectral 
characterization of correlation immune functions. A function f is mth order 
correlation immune if and only if its Walsh transform $\widehat{f}$ satisfies 
$\widehat{f}(u) = 0$ for $1 \le wt(u) \le m$, where $wt(u)$ denotes the Hamming 
weight of u. Notice that the two constant Boolean functions are mth order 
correlation immune, but they are of no interest from a cryptographic point of 
view. Function f is balanced if and only if $\widehat{f}(0) = 0$. A balanced 
mth order correlation immune function is called m-resilient. The nonlinearity 
bounds recalled in the introduction are direct consequences of Relation (2.1), 
of Sarkar’s and Maitra’s divisibility bound and of the universal bound. The 
fact that any m-resilient function with nonlinearity $2^{n-1} - 2^{m+1}$ has 
degree $n - m - 1$ can be deduced from the improved divisibility bound given in 
[5]; any such function must also be plateaued because $\widehat{f}(a)$, being 
divisible by $2^{m+2}$ and having magnitude upper bounded by this same number 
according to Relation (2.1), must equal 0 or $\pm 2^{m+2}$. 





By an (n, m, d, N) function we mean an n-variable, m-resilient function hav- 
ing degree d and nonlinearity N. In the above notation, we may replace some 
component by — if we do not want to specify it. 
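
The four components of this notation can be read off the truth table of a function by combining the spectral characterization of Xiao and Massey, the algebraic normal form and Relation (2.1). The Python sketch below is illustrative only: the helper names are ours, the Walsh transform is computed naively (about $4^n$ operations), and a non-balanced function is reported with resiliency order -1.

```python
def walsh_transform(tt):
    """Naive Walsh transform: entry a is sum_x (-1)^(f(x) + a.x)."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(size))
            for a in range(size)]

def algebraic_degree(tt):
    """Degree of the algebraic normal form, via the binary Moebius transform."""
    coeffs = list(tt)
    for i in range(len(tt).bit_length() - 1):
        for x in range(len(tt)):
            if x & (1 << i):
                coeffs[x] ^= coeffs[x ^ (1 << i)]
    return max((bin(I).count("1") for I, c in enumerate(coeffs) if c), default=0)

def profile(tt):
    """Return (n, m, d, N) for the function with truth table tt."""
    n = len(tt).bit_length() - 1
    spectrum = walsh_transform(tt)
    # f is m-resilient iff its Walsh transform vanishes on all words of weight <= m.
    m = min(bin(a).count("1") for a, w in enumerate(spectrum) if w) - 1
    N = 2 ** (n - 1) - max(abs(w) for w in spectrum) // 2
    return (n, m, algebraic_degree(tt), N)

# Example: f(x1, ..., x4) = x1 + x2 + x3 x4.
f = [(x & 1) ^ ((x >> 1) & 1) ^ (((x >> 2) & 1) & ((x >> 3) & 1)) for x in range(16)]
```

With this example, `profile(f)` returns (4, 1, 2, 4): the function has degree n - m - 1 = 2 and nonlinearity $2^{n-1} - 2^{m+1} = 4$.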

3. The Known Elementary Constructions 

In this section, we gather the known results on secondary constructions, which 
have appeared in scattered ways in the literature. 

3.1. Direct Sums of Functions 

3.1.1. Adding a variable. Let f be an r-variable t-resilient function. The 
Boolean function on $F_2^{r+1}$: 

$$h(x_1,\dots,x_r,x_{r+1}) = f(x_1,\dots,x_r) + x_{r+1}$$ 

(the addition being computed in $F_2$) is (t+1)-resilient [29]. If f is an 
$(r, t, r-t-1, 2^{r-1} - 2^{t+1})$ function, then h is an 
$(r+1, t+1, r-t-1, 2^r - 2^{t+2})$ function, and thus achieves Siegenthaler’s 
and Sarkar et al.’s bounds. This construction does not increase the degree. 
Also, h has the linear structure $(0,\dots,0,1)$. 

3.1.2. Generalization. If f is an r-variable t-resilient function and if g is 
an s-variable m-resilient function, then the function: 

$$h(x_1,\dots,x_r,x_{r+1},\dots,x_{r+s}) = f(x_1,\dots,x_r) + g(x_{r+1},\dots,x_{r+s})$$ 

is (t+m+1)-resilient. This construction was first introduced by Rothaus in 
[25], for generating bent functions. The resiliency property of h comes from 
the easily provable relation 
$\widehat{h}(a,b) = \widehat{f}(a) \times \widehat{g}(b)$, 
$\forall a \in F_2^r$, $b \in F_2^s$. We have also 
$d^\circ h = \max(d^\circ f, d^\circ g)$ and, thanks to Relation (2.1), 
$N_h = 2^{r+s-1} - \frac{1}{2}(2^r - 2N_f)(2^s - 2N_g) = 2^r N_g + 2^s N_f - 2 N_f N_g$. 
Such a function is not fully satisfactory (J. Dillon already explained in [12] 
that such decomposable functions have weaknesses). For instance, h has low 
degree, in general. And if $N_f = 2^{r-1} - 2^{t+1}$ and 
$N_g = 2^{s-1} - 2^{m+1}$, then $N_h = 2^{r+s-1} - 2^{t+m+3}$ and h does not 
achieve Sarkar’s and Maitra’s bound (note that this is not in contradiction 
with the properties of the construction recalled at Subsection 3.1.1, since the 
function $g(x_{r+1}) = x_{r+1}$ is 0-resilient, that is, balanced, but has 
nonlinearity greater than $2^0 - 2^1$). 

Function h has no nonzero linear structure if and only if f and g both have 
no nonzero linear structure. 
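
The product relation for the Walsh transforms and the resulting nonlinearity formula can be checked on a small instance. In this Python sketch (our own helper names and indexing convention: x occupies the low bits of the truth-table index, y the high bits) we take f = x1 + x2, which is 1-resilient, and g = y1 y2 + y3, which is 0-resilient.

```python
def walsh_transform(tt):
    """Naive Walsh transform: entry a is sum_x (-1)^(f(x) + a.x)."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(size))
            for a in range(size)]

def nonlinearity(tt):
    """Relation (2.1)."""
    return len(tt) // 2 - max(abs(w) for w in walsh_transform(tt)) // 2

def direct_sum(f_tt, g_tt):
    """Truth table of h(x, y) = f(x) + g(y), indexed by y * 2^r + x."""
    return [f_tt[x] ^ g_tt[y] for y in range(len(g_tt)) for x in range(len(f_tt))]

f = [(x & 1) ^ ((x >> 1) & 1) for x in range(4)]                     # r = 2, t = 1
g = [((y & 1) & ((y >> 1) & 1)) ^ ((y >> 2) & 1) for y in range(8)]  # s = 3, m = 0
h = direct_sum(f, g)

fw, gw, hw = walsh_transform(f), walsh_transform(g), walsh_transform(h)
Nf, Ng = nonlinearity(f), nonlinearity(g)
```

On this instance the Walsh value of h at (a, b) is `fw[a] * gw[b]`, h is 2-resilient, and `nonlinearity(h)` equals `4 * Ng + 8 * Nf - 2 * Nf * Ng`, that is, 8.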

3.2. Siegenthaler’s Construction 

Let f and g be two Boolean functions on $F_2^r$. Consider the function 

$$h(x_1,\dots,x_r,x_{r+1}) = (x_{r+1} + 1) f(x_1,\dots,x_r) + x_{r+1}\, g(x_1,\dots,x_r)$$ 

on $F_2^{r+1}$. Note that the truth-table of h can be obtained by concatenating 
the truth-tables of f and g. We have: 

$$\widehat{h}(a_1,\dots,a_r,a_{r+1}) = \widehat{f}(a_1,\dots,a_r) + (-1)^{a_{r+1}}\, \widehat{g}(a_1,\dots,a_r).$$ 





Thus: 

1. If f and g are m-resilient, then h is m-resilient [29] (since the vector 
   $(a_1,\dots,a_r)$ has Hamming weight smaller than or equal to that of the 
   vector $(a_1,\dots,a_{r+1})$); moreover, if for every $a \in F_2^r$ of 
   Hamming weight $wt(a) = m + 1$ we have 
   $\widehat{f}(a) + \widehat{g}(a) = 0$, then h is (m+1)-resilient. Note that 
   the construction recalled at Subsection 3.1.1 corresponds to $g = f + 1$ 
   and satisfies this condition. Another possible choice of a function g 
   satisfying this condition is $g(x) = f(x_1 + 1,\dots,x_r + 1) + \epsilon$, 
   where $\epsilon = m \bmod 2$ (it was first pointed out in [1]), since 
   $\widehat{g}(a) = \sum_{x \in F_2^r} (-1)^{f(x) + \epsilon + (x + (1,\dots,1)) \cdot a} = (-1)^{\epsilon + wt(a)}\, \widehat{f}(a)$. 
   It leads to a function h having also a nonzero linear structure (namely, 
   the all-one vector); 

2. The number 
   $\max_{a_1,\dots,a_{r+1} \in F_2} |\widehat{h}(a_1,\dots,a_r,a_{r+1})|$ is 
   clearly upper bounded by 
   $\max_{a_1,\dots,a_r \in F_2} |\widehat{f}(a_1,\dots,a_r)| + \max_{a_1,\dots,a_r \in F_2} |\widehat{g}(a_1,\dots,a_r)|$; 
   this implies the inequality 
   $2^{r+1} - 2N_h \le 2^{r+1} - 2N_f - 2N_g$, that is, $N_h \ge N_f + N_g$; 

   a. if f and g achieve nonlinearity $2^{r-1} - 2^{m+1}$ and if h is 
      (m+1)-resilient, then the nonlinearity $2^r - 2^{m+2}$ of h is the best 
      possible; 

   b. if f and g are such that, for every word a, at least one of the numbers 
      $\widehat{f}(a)$, $\widehat{g}(a)$ is null (in other words, if the 
      supports of the Walsh transforms of f and g are disjoint), then the 
      number 
      $\max_{a_1,\dots,a_{r+1} \in F_2} |\widehat{h}(a_1,\dots,a_r,a_{r+1})|$ 
      is equal to 
      $\max\left( \max_{a_1,\dots,a_r \in F_2} |\widehat{f}(a_1,\dots,a_r)|;\ \max_{a_1,\dots,a_r \in F_2} |\widehat{g}(a_1,\dots,a_r)| \right)$. 
      Hence, we have $2^{r+1} - 2N_h = 2^r - 2\min(N_f, N_g)$ and $N_h$ 
      therefore equals $2^{r-1} + \min(N_f, N_g)$; thus, if f and g achieve 
      nonlinearity $2^{r-1} - 2^{m+1}$, then h achieves the best possible 
      nonlinearity $2^r - 2^{m+1}$; 

3. If the monomials of highest degree in the algebraic normal forms of f and g 
   are not all the same, then $d^\circ h = 1 + \max(d^\circ f, d^\circ g)$. 
   Note that this condition is not satisfied in the two cases indicated above 
   in 1, where h is (m+1)-resilient. 

4. For every $a = (a_1,\dots,a_r) \in F_2^r$ and every $a_{r+1} \in F_2$, we 
   have, denoting $(x_1,\dots,x_r)$ by x: 

   $$D_{(a,a_{r+1})} h(x, x_{r+1}) = D_a f(x) + a_{r+1}(f + g)(x) + x_{r+1} D_a(f + g)(x) + a_{r+1} D_a(f + g)(x).$$ 

   If $d^\circ(f + g) \ge d^\circ f$ and if there does not exist $a \ne 0$ 
   such that $D_a f$ and $D_a g$ are constant and equal to each other, then h 
   admits no nonzero linear structure (this is in fact a particular case of 
   Corollary 5.2 below). 

This construction makes it possible to obtain: 

- from any two m-resilient functions f and g having disjoint Walsh spectra, 
  achieving nonlinearity $2^{r-1} - 2^{m+1}$ and such that 
  $d^\circ(f + g) = r - m - 1$, an m-resilient function h having degree 
  $r - m$ and nonlinearity $2^r - 2^{m+1}$, that is, achieving Siegenthaler’s 
  and Sarkar et al.’s bounds; note that this construction increases the degree 
  by 1; 

- from any m-resilient function f achieving degree $r - m - 1$ and 
  nonlinearity $2^{r-1} - 2^{m+1}$, a function h having resiliency order 
  $m + 1$ and nonlinearity $2^r - 2^{m+2}$, that is, achieving Siegenthaler’s 
  and Sarkar et al.’s bounds and having the same degree as f (but having 
  nonzero linear structures). 

So, when these two methods are combined, it is possible to keep the best 
trade-offs between resiliency order, degree and nonlinearity, and to increase 
by 1 the degree and the resiliency order. 
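
Siegenthaler’s construction is simply a concatenation of truth tables, so its properties can be checked quickly on small examples. The Python sketch below uses our own helper names and puts $x_{r+1}$ in the top bit of the index; it takes the balanced function f = x1 x2 + x3 (so m = 0) and the companion $g(x) = f(x_1 + 1,\dots,x_r + 1) + \epsilon$ with $\epsilon = m \bmod 2$ discussed in item 1.

```python
def walsh_transform(tt):
    """Naive Walsh transform: entry a is sum_x (-1)^(f(x) + a.x)."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(size))
            for a in range(size)]

def resiliency_order(tt):
    """Largest m such that f is m-resilient (-1 if f is not balanced)."""
    return min(bin(a).count("1") for a, w in enumerate(walsh_transform(tt)) if w) - 1

def siegenthaler(f_tt, g_tt):
    """h = (x_{r+1} + 1) f + x_{r+1} g: the truth tables of f and g concatenated."""
    return list(f_tt) + list(g_tt)

def complemented_companion(f_tt, m):
    """g(x) = f(x_1 + 1, ..., x_r + 1) + (m mod 2)."""
    all_ones = len(f_tt) - 1
    return [f_tt[x ^ all_ones] ^ (m & 1) for x in range(len(f_tt))]

f = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(8)]  # 0-resilient
g = complemented_companion(f, 0)
h = siegenthaler(f, g)
fw, gw, hw = walsh_transform(f), walsh_transform(g), walsh_transform(h)
```

Here `hw[a]` equals `fw[a] + gw[a]` and `hw[a + 8]` equals `fw[a] - gw[a]` for every a, and the resiliency order rises from 0 for f to 1 for h.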



3.3. Tarannikov’s Elementary Construction 

Let f be any Boolean function on $F_2^r$. Define the Boolean function h on 
$F_2^{r+1}$ by 
$h(x_1,\dots,x_r,x_{r+1}) = x_{r+1} + f(x_1,\dots,x_{r-1}, x_r + x_{r+1})$. For 
every $(a_1,\dots,a_{r+1}) \in F_2^{r+1}$, if we denote $(a_1,\dots,a_{r-1})$ 
by a and $(x_1,\dots,x_{r-1})$ by x, then, after replacing $x_r$ by 
$x_r + x_{r+1}$ in the sum, $\widehat{h}(a_1,\dots,a_{r+1})$ is equal to 

$$\sum_{x_1,\dots,x_{r+1} \in F_2} (-1)^{a \cdot x + f(x_1,\dots,x_r) + a_r(x_r + x_{r+1}) + (a_{r+1} + 1)x_{r+1}} = \sum_{x_1,\dots,x_{r+1} \in F_2} (-1)^{a \cdot x + f(x_1,\dots,x_r) + a_r x_r + (a_r + a_{r+1} + 1)x_{r+1}};$$ 

it is null if $a_{r+1} = a_r$ and it equals $2\widehat{f}(a_1,\dots,a_r)$ if 
$a_{r+1} = a_r + 1$. Thus: 

1. $N_h = 2N_f$; 

2. If f is t-resilient, then h is t-resilient (since the vector 
   $(a_1,\dots,a_r)$ has Hamming weight smaller than or equal to that of the 
   vector $(a_1,\dots,a_{r+1})$). If, additionally, 
   $\widehat{f}(a_1,\dots,a_{r-1},1)$ is null for every vector 
   $(a_1,\dots,a_{r-1})$ of weight at most t, then h is (t+1)-resilient (since 
   the only case where $\widehat{h}(a_1,\dots,a_{r+1})$ may be nonzero is when 
   $a_{r+1} = a_r + 1 = 0$); note that, in such a case, if f has nonlinearity 
   $2^{r-1} - 2^{t+1}$, then the nonlinearity of h, which equals 
   $2^r - 2^{t+2}$, achieves Sarkar et al.’s bound (and, hence, Siegenthaler’s 
   bound). The condition that $\widehat{f}(a_1,\dots,a_{r-1},1)$ is null for 
   every vector $(a_1,\dots,a_{r-1})$ of weight at most t is achieved if f 
   does not actually depend on its last input bit; but the construction is 
   then a particular case of the construction recalled at Subsection 3.1.1. 
   The condition is also achieved if f is obtained from two t-resilient 
   functions, by using Siegenthaler’s Construction (recalled at Subsection 
   3.2). 

3. $d^\circ h = d^\circ f$ if $d^\circ f \ge 1$. 

4. h has the nonzero linear structure $(0,\dots,0,1,1)$. 
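
The four properties above can be verified mechanically on a small function. The Python sketch below is a hedged illustration with helper names of our own; bit i-1 of a truth-table index holds $x_i$, and the starting function f = x1 x3 + x2 is an arbitrary choice.

```python
def walsh_transform(tt):
    """Naive Walsh transform: entry a is sum_x (-1)^(f(x) + a.x)."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(size))
            for a in range(size)]

def nonlinearity(tt):
    """Relation (2.1)."""
    return len(tt) // 2 - max(abs(w) for w in walsh_transform(tt)) // 2

def tarannikov_step(f_tt):
    """h(x_1..x_{r+1}) = x_{r+1} + f(x_1, ..., x_{r-1}, x_r + x_{r+1})."""
    r = len(f_tt).bit_length() - 1
    h = []
    for idx in range(2 * len(f_tt)):
        x_last = idx >> r                 # x_{r+1}
        x = idx & ((1 << r) - 1)          # (x_1, ..., x_r)
        if x_last:
            x ^= 1 << (r - 1)             # replace x_r by x_r + x_{r+1}
        h.append(x_last ^ f_tt[x])
    return h

f = [((x & 1) & ((x >> 2) & 1)) ^ ((x >> 1) & 1) for x in range(8)]  # x1 x3 + x2, r = 3
h = tarannikov_step(f)
fw, hw = walsh_transform(f), walsh_transform(h)
```

On this example `nonlinearity(h)` is twice `nonlinearity(f)`, the Walsh value of h at (a_1, ..., a_4) is 0 when a_4 = a_3 and `2 * fw[a & 7]` otherwise, and flipping the two top bits of the input always flips h, which is the linear structure (0, 0, 1, 1).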



Tarannikov combined in [31] this construction with the constructions recalled 
at Subsections 3.1 and 3.2, to build a more complex secondary construction, 
which makes it possible to increase at the same time the resiliency order and 
the degree of the functions and which leads to an infinite sequence of 
functions achieving Siegenthaler’s and Sarkar et al.’s bounds. By then 
enlarging, through the construction recalled at Subsection 3.1.1, the set of 
ordered pairs (n, m) for which such functions can be constructed, he deduced 
the existence of n-variable m-resilient functions achieving Siegenthaler’s and 
Sarkar et al.’s bounds for any number of variables n and any resiliency order 
m such that $m \ge \frac{2n-7}{3}$ and $m > \frac{n}{2} - 2$. Pasalic et al. 
slightly modified this more complex construction of Tarannikov in [24], into a 
construction that we call Tarannikov et al.’s construction, which made it 
possible, when iterating it together with the construction recalled at 
Subsection 3.1.1, to relax slightly the condition on m into 
$m \ge \frac{2n-10}{3}$ and $m > \frac{n}{2} - 2$ (the use of Construction 
3.1.1 then gives functions with nonzero linear structures). We describe this 
construction precisely in Section 4. 

3.4. Maiorana-McFarland’s Construction 

We use, in Section 6, a primary construction of resilient functions called
Maiorana-McFarland's construction, introduced in [1] (and later studied in
[4, 6, 10, 11]): let $m$, $s_1$ and $s_2$ be positive integers such that $s_1 > m$;
let $g$ be any Boolean function on $F_2^{s_2}$ and $\phi$ a mapping from $F_2^{s_2}$
to $F_2^{s_1}$ such that every element in $\phi(F_2^{s_2})$ has Hamming weight
strictly greater than $m$. Then the function:

$$f_{\phi,g}(x,y) = x \cdot \phi(y) + g(y), \quad x \in F_2^{s_1},\ y \in F_2^{s_2} \qquad (3.1)$$

is $m$-resilient. Indeed, for every $a \in F_2^{s_1}$ and every $b \in F_2^{s_2}$, we have

$$\widehat{f_{\phi,g}}(a,b) = 2^{s_1} \sum_{y \in \phi^{-1}(a)} (-1)^{g(y) + b \cdot y} \qquad (3.2)$$

since every (affine) restriction of $f_{\phi,g}$ to a coset of $F_2^{s_1}$:
$x \mapsto f_{\phi,g}(x,y) + a \cdot x + b \cdot y$ either is constant or is balanced
on $F_2^{s_1}$, in which last case it contributes 0 to the sum
$\sum_{x \in F_2^{s_1},\, y \in F_2^{s_2}} (-1)^{f_{\phi,g}(x,y) + a \cdot x + b \cdot y}$.
Note that if $\phi$ is injective, then $f_{\phi,g}$ has nonlinearity
$2^{s_1+s_2-1} - 2^{s_1-1}$, and that two such functions corresponding to two
mappings $\phi_1$ and $\phi_2$ such that
$\phi_1(F_2^{s_2}) \cap \phi_2(F_2^{s_2}) = \emptyset$ have disjoint Walsh supports.
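As a small numerical illustration of Relations (3.1) and (3.2), the following sketch builds one Maiorana-McFarland function and checks its resiliency and nonlinearity by brute force. The sizes $s_1 = 3$, $s_2 = 1$, $m = 1$, the mapping `phi` and the choice $g = 0$ are assumptions made only for this example, and the nonlinearity is computed with the usual relation $N_f = 2^{n-1} - \frac{1}{2}\max|\widehat{f}|$.

```python
from itertools import product

def dot(u, v):
    """Inner product over F_2 of two equal-length bit tuples."""
    return sum(a & b for a, b in zip(u, v)) & 1

s1, s2, m = 3, 1, 1                      # illustrative sizes (assumption)
phi = lambda y: (y[0], 1 ^ y[0], 1)      # images (0,1,1) and (1,0,1): weight 2 > m
g = lambda y: 0                          # any Boolean function on F_2^{s2} would do

X = list(product((0, 1), repeat=s1))
Y = list(product((0, 1), repeat=s2))
f = {(x, y): dot(x, phi(y)) ^ g(y) for x in X for y in Y}   # relation (3.1)

def walsh(a, b):
    """Sum over (x, y) of (-1)^(f(x, y) + a.x + b.y)."""
    return sum((-1) ** (f[x, y] ^ dot(a, x) ^ dot(b, y)) for x in X for y in Y)

spectrum = {(a, b): walsh(a, b) for a in X for b in Y}

# m-resiliency: the Walsh transform vanishes at every (a, b) of weight <= m
is_m_resilient = all(w == 0 for (a, b), w in spectrum.items()
                     if sum(a) + sum(b) <= m)

# relation (3.2): only the y with phi(y) = a contribute
relation_32_holds = all(
    w == 2 ** s1 * sum((-1) ** (g(y) ^ dot(b, y)) for y in Y if phi(y) == a)
    for (a, b), w in spectrum.items())

# phi is injective here, so the nonlinearity should be 2^(s1+s2-1) - 2^(s1-1)
nonlinearity = 2 ** (s1 + s2 - 1) - max(abs(w) for w in spectrum.values()) // 2

print(is_m_resilient, relation_32_holds, nonlinearity)
```

With these choices the function is $x_1x_4 + x_2x_4 + x_2 + x_3$ once $y$ is renamed $x_4$, so the check is small enough to be redone by hand.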

This construction of resilient Maiorana-McFarland's functions is an adaptation
of a construction of bent functions (see [12]): if $n$ is even, if $\pi$ is a
permutation of $F_2^{n/2}$ and if $g$ is a Boolean function on $F_2^{n/2}$, then the
function $f(x,y) = x \cdot \pi(y) + g(y)$, $x, y \in F_2^{n/2}$, is bent.

4. Tarannikov et al.’s Construction 

In this section, we consider the construction introduced in [31], and modified in
[24]. We call it Tarannikov et al.'s construction. Let us first present it as has
been done in [24]. It uses two $(n-1, t, d-1, 2^{n-2} - 2^{t+1})$ functions $f_1$ and
$f_2$ to design an $(n+3, t+2, d+1, 2^{n+2} - 2^{t+3})$ function $h$, assuming that
$f_1 + f_2$ has also degree $d-1$ and that the supports of the Walsh transforms of
$f_1$ and $f_2$ are disjoint. The two restrictions
$h_1(x_1,\dots,x_{n+2}) = h(x_1,\dots,x_{n+2},0)$ and
$h_2(x_1,\dots,x_{n+2}) = h(x_1,\dots,x_{n+2},1)$ have then also disjoint Walsh
supports, and these two functions can then be used in the places of $f_1$ and $f_2$
(all these properties will be proved again below). This makes it possible to
generate an infinite sequence of functions achieving Sarkar et al.'s and
Siegenthaler's bounds.

Remark 4.1. As observed in [19, 24], the assumption that the supports of the
Walsh transforms of $f_1$ and $f_2$ are disjoint is equivalent to the assumption that
the function $(1 + x_n) f_1 + x_n f_2$ has nonlinearity $2^{n-1} - 2^{t+1}$ (if they
are not, then $(1 + x_n) f_1 + x_n f_2$ has nonlinearity $2^{n-1} - 2^{t+2}$).

Also, the assumption that $f_1 + f_2$ has degree $d-1$ is equivalent to the
assumption that this same function $(1 + x_n) f_1 + x_n f_2$ has degree $d$.

The function $h$ is defined by the relations

$$f(x_1,\dots,x_n) = (1 + x_n)\, f_1(x_1,\dots,x_{n-1}) + x_n\, f_2(x_1,\dots,x_{n-1}),$$

$$F(x_1,\dots,x_{n+2}) = x_{n+2} + x_{n+1} + f(x_1,\dots,x_n),$$

$$G(x_1,\dots,x_{n+2}) = (1 + x_{n+2} + x_{n+1})\, f_1(x_1,\dots,x_{n-1}) + (x_{n+2} + x_{n+1})\, f_2(x_1,\dots,x_{n-1}) + x_{n+2} + x_n,$$

and

$$h(x_1,\dots,x_{n+3}) = (1 + x_{n+3})\, F(x_1,\dots,x_{n+2}) + x_{n+3}\, G(x_1,\dots,x_{n+2}).$$

If we translate this definition of $h$ into a single formula, we obtain that
$h(x_1,\dots,x_{n+3})$ equals:

$$(g(x_n,\dots,x_{n+3}) + 1)\, f_1(x_1,\dots,x_{n-1}) + g(x_n,\dots,x_{n+3})\, f_2(x_1,\dots,x_{n-1}) + g'(x_n,\dots,x_{n+3}),$$

where the functions $g$ and $g'$ are defined by $g(x) = x_1x_4 + x_2x_4 + x_3x_4 + x_1$
and $g'(x) = x_1x_4 + x_2x_4 + x_2 + x_3$. This can be checked by a direct
calculation. It can also be deduced by considering the truth-table of $h$. By
definition, this truth-table can be obtained by concatenating the truth-tables of
the functions

$$f_1, f_2, \bar{f_1}, \bar{f_2}, \bar{f_1}, \bar{f_2}, f_1, f_2, f_1, \bar{f_1}, f_2, \bar{f_2}, \bar{f_2}, f_2, \bar{f_1} \text{ and } f_1$$

(where $\bar{f_i} = f_i + 1$). This implies that the function $h(x_1,\dots,x_{n+3})$
is equal to the function

$$(g(x_n,\dots,x_{n+3}) + 1)\, f_1(x_1,\dots,x_{n-1}) + g(x_n,\dots,x_{n+3})\, f_2(x_1,\dots,x_{n-1}) + g'(x_n,\dots,x_{n+3}),$$

where the two 4-variable Boolean functions $g$ and $g'$ have truth-tables given in
Table 1.

The ANF of $g$ can easily be deduced from its truth-table and equals
$x_1x_4 + x_2x_4 + x_3x_4 + x_1$; the ANF of $g'$ equals
$x_1x_4 + x_2x_4 + x_2 + x_3$. Thus we have

$$h(x_1,\dots,x_{n+3}) = (1 + x_nx_{n+3} + x_{n+1}x_{n+3} + x_{n+2}x_{n+3} + x_n)\, f_1(x_1,\dots,x_{n-1})$$
$$\qquad + (x_nx_{n+3} + x_{n+1}x_{n+3} + x_{n+2}x_{n+3} + x_n)\, f_2(x_1,\dots,x_{n-1}) + x_nx_{n+3} + x_{n+1}x_{n+3} + x_{n+1} + x_{n+2}.$$
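The "direct calculation" mentioned above can also be delegated to a short exhaustive check. The sketch below compares the step-by-step definition of $h$ (through $f$, $F$ and $G$) with the single formula in $g$ and $g'$, for randomly chosen $f_1$ and $f_2$. It assumes the reading in which $G$ ends with $x_{n+2} + x_n$ and $h$ uses the factor $1 + x_{n+3}$ in front of $F$; the value $n = 4$ and the random test functions are illustrative choices only.

```python
import random
from itertools import product

random.seed(0)
n = 4                                    # illustrative size (assumption)
points = list(product((0, 1), repeat=n - 1))
f1 = {x: random.randint(0, 1) for x in points}   # arbitrary test functions
f2 = {x: random.randint(0, 1) for x in points}

def h_from_relations(z):
    """h built step by step from f, F and G; z = (x_1, ..., x_{n+3})."""
    x, xn, xn1, xn2, xn3 = z[:n - 1], z[n - 1], z[n], z[n + 1], z[n + 2]
    f = ((1 ^ xn) & f1[x]) ^ (xn & f2[x])
    F = xn2 ^ xn1 ^ f
    G = ((1 ^ xn2 ^ xn1) & f1[x]) ^ ((xn2 ^ xn1) & f2[x]) ^ xn2 ^ xn
    return ((1 ^ xn3) & F) ^ (xn3 & G)

def g(y):
    y1, y2, y3, y4 = y
    return (y1 & y4) ^ (y2 & y4) ^ (y3 & y4) ^ y1

def g_prime(y):
    y1, y2, y3, y4 = y
    return (y1 & y4) ^ (y2 & y4) ^ y2 ^ y3

def h_single_formula(z):
    """h = (g + 1) f1 + g f2 + g', with y = (x_n, ..., x_{n+3})."""
    x, y = z[:n - 1], z[n - 1:]
    return ((1 ^ g(y)) & f1[x]) ^ (g(y) & f2[x]) ^ g_prime(y)

formulas_agree = all(h_from_relations(z) == h_single_formula(z)
                     for z in product((0, 1), repeat=n + 3))
print(formulas_agree)
```

Since the identity is polynomial in $f_1$ and $f_2$, agreement on random inputs is only a sanity check; the algebraic expansion remains the actual proof.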







x1  x2  x3  x4   g(x)  g'(x)
 0   0   0   0     0     0
 1   0   0   0     1     0
 0   1   0   0     0     1
 1   1   0   0     1     1
 0   0   1   0     0     1
 1   0   1   0     1     1
 0   1   1   0     0     0
 1   1   1   0     1     0
 0   0   0   1     0     0
 1   0   0   1     0     1
 0   1   0   1     1     0
 1   1   0   1     1     1
 0   0   1   1     1     1
 1   0   1   1     1     0
 0   1   1   1     0     1
 1   1   1   1     0     0

Table 1



In order to find an explanation of the nice properties of Tarannikov et al.'s
construction, let us first calculate the Walsh transforms of $g$, $g'$ and
$g'' = g + g'$. They are given in Table 2.

We observe that $g$ is balanced but that it is not 1-resilient. On the contrary,
we see that $g'$ and $g''$ are both 1-resilient. Moreover, the supports of their
Walsh transforms are disjoint and they achieve Siegenthaler's and Sarkar et al.'s
bounds. We also observe that the functions $|\widehat{g'}(x)|$ and
$|\widehat{g''}(x)|$ are invariant under the mapping $x \mapsto x + (0,0,0,1)$
(that is, they do not depend in fact on $x_4$).

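x1  x2  x3  x4   $\widehat{g}(x)$  $\widehat{g'}(x)$  $\widehat{g''}(x)$
 0   0   0   0       0        0        0
 1   0   0   0       8        0        0
 0   1   0   0       0        0        0
 1   1   0   0       0        0        8
 0   0   1   0       0        0        0
 1   0   1   0       0        8        0
 0   1   1   0       8        8        0
 1   1   1   0       0        0        8
 0   0   0   1       0        0        0
 1   0   0   1       8        0        0
 0   1   0   1       0        0        0
 1   1   0   1       0        0       -8
 0   0   1   1       0        0        0
 1   0   1   1       0       -8        0
 0   1   1   1      -8        8        0
 1   1   1   1       0        0        8

Table 2

Let us calculate the Walsh transform of $h$. To simplify the notation, we
shall denote $(x_1,\dots,x_{n-1})$ by $x$ and $(x_n,\dots,x_{n+3})$ by $y$. Thus we
have $h(x,y) = (g(y) + 1)\, f_1(x) + g(y)\, f_2(x) + g'(y)$, $x \in F_2^{n-1}$,
$y \in F_2^4$. The Walsh transform of $h$ equals:

$$\widehat{h}(a,b) = \sum_{y \in F_2^4 / g(y)=0} (-1)^{g'(y)+b\cdot y} \Big( \sum_{x \in F_2^{n-1}} (-1)^{f_1(x)+a\cdot x} \Big) + \sum_{y \in F_2^4 / g(y)=1} (-1)^{g'(y)+b\cdot y} \Big( \sum_{x \in F_2^{n-1}} (-1)^{f_2(x)+a\cdot x} \Big)$$

$$= \widehat{f_1}(a) \sum_{y \in F_2^4 / g(y)=0} (-1)^{g'(y)+b\cdot y} + \widehat{f_2}(a) \sum_{y \in F_2^4 / g(y)=1} (-1)^{g'(y)+b\cdot y}$$

$$= \widehat{f_1}(a) \sum_{y \in F_2^4} (-1)^{g'(y)+b\cdot y} \Big( \frac{1 + (-1)^{g(y)}}{2} \Big) + \widehat{f_2}(a) \sum_{y \in F_2^4} (-1)^{g'(y)+b\cdot y} \Big( \frac{1 - (-1)^{g(y)}}{2} \Big)$$

$$= \frac{1}{2}\, \widehat{f_1}(a) \big[ \widehat{g'}(b) + \widehat{g''}(b) \big] + \frac{1}{2}\, \widehat{f_2}(a) \big[ \widehat{g'}(b) - \widehat{g''}(b) \big].$$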

The functions $g'$ and $g''$ being 1-resilient and the functions $f_1$ and $f_2$
being $t$-resilient, we can see that $h$ is $(t+2)$-resilient (indeed, if
$(a,b) \in F_2^{n-1} \times F_2^4$ has weight at most $t+2$ then either $a$ has
weight at most $t$ or $b$ has weight at most 1). The degree of $h$ clearly equals
$d+1$ since $f_1 + f_2$ has degree $d-1$ and since $g$ has degree 2, and we have
$\max_{(a,b) \in F_2^{n-1} \times F_2^4} |\widehat{h}(a,b)| = 4\, \big( \max_{i \in \{1,2\}} \max_{a \in F_2^{n-1}} |\widehat{f_i}(a)| \big)$,
since the supports of the Walsh transforms of $g'$ and $g''$ are disjoint, as well
as those of $f_1$ and $f_2$, and since $|\widehat{g'}(b)|$ and $|\widehat{g''}(b)|$
equal 0 or 8, for every $b \in F_2^4$.
Notice that $g$ does not play a direct role above (except for its degree): only
$g'$ and $g''$ play roles. Moreover, if we denote $g'$ and $g''$ by $g_1$ and $g_2$
($g$ becomes then $g_1 + g_2$), a nice symmetry between $(f_1, f_2)$ and
$(g_1, g_2)$ appears, and this leads to Theorem 5.1. This theorem and
Proposition 5.1, by exhibiting a simple and more general framework for
Tarannikov et al.'s construction, give an explanation of the nice properties of
this construction.
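The Walsh spectra of $g$, $g'$ and $g''$ and the closed form of the Walsh transform of $h$ can be recomputed by brute force. The sketch below is only an illustration: truth tables are indexed by integers whose bit $i-1$ holds $x_i$ (so $x_1$ varies fastest, as in the tables of this section), `g1` and `g2` stand for $g'$ and $g''$, and $f_1$, $f_2$ are random 3-variable functions chosen for the test, since the closed form does not depend on any property of $f_1$ and $f_2$.

```python
import random

def walsh(tt):
    """Walsh transform of a truth table indexed by integers (bit i-1 <-> x_i)."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(size))
            for a in range(size)]

def table(anf):
    """Truth table of a 4-variable function given by its ANF."""
    return [anf(*[(y >> i) & 1 for i in range(4)]) for y in range(16)]

g = table(lambda x1, x2, x3, x4: (x1 & x4) ^ (x2 & x4) ^ (x3 & x4) ^ x1)
g1 = table(lambda x1, x2, x3, x4: (x1 & x4) ^ (x2 & x4) ^ x2 ^ x3)    # g'
g2 = [u ^ v for u, v in zip(g, g1)]                                   # g'' = g + g'
Wg, Wg1, Wg2 = walsh(g), walsh(g1), walsh(g2)

low_weight = [b for b in range(16) if bin(b).count("1") <= 1]
g_is_1_resilient = all(Wg[b] == 0 for b in low_weight)
g1_is_1_resilient = all(Wg1[b] == 0 for b in low_weight)
g2_is_1_resilient = all(Wg2[b] == 0 for b in low_weight)
supports_disjoint = all(Wg1[b] == 0 or Wg2[b] == 0 for b in range(16))

# h(x, y) = (g(y) + 1) f1(x) + g(y) f2(x) + g'(y), index z = x + 8 * y
random.seed(0)
f1 = [random.randint(0, 1) for _ in range(8)]
f2 = [random.randint(0, 1) for _ in range(8)]
h = [((1 ^ g[z >> 3]) & f1[z & 7]) ^ (g[z >> 3] & f2[z & 7]) ^ g1[z >> 3]
     for z in range(128)]
Wf1, Wf2, Wh = walsh(f1), walsh(f2), walsh(h)

# 2 * h^(a, b) = f1^(a) [g'^(b) + g''^(b)] + f2^(a) [g'^(b) - g''^(b)]
closed_form_holds = all(
    2 * Wh[a + 8 * b] == Wf1[a] * (Wg1[b] + Wg2[b]) + Wf2[a] * (Wg1[b] - Wg2[b])
    for a in range(8) for b in range(16))

print(Wg, Wg1, Wg2, sep="\n")
print(g_is_1_resilient, g1_is_1_resilient, g2_is_1_resilient,
      supports_disjoint, closed_form_holds)
```

The three printed spectra can be compared row by row with the Walsh-transform columns of Table 2.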








5. A Generalization of Tarannikov et al.’s Construction 



Theorem 5.1. Let $r$, $s$, $t$ and $m$ be positive integers such that $t < r$ and
$m < s$. Let $f_1$ and $f_2$ be two $r$-variable $t$-resilient functions. Let $g_1$
and $g_2$ be two $s$-variable $m$-resilient functions. Then the function

$$h(x,y) = f_1(x) + g_1(y) + (f_1 + f_2)(x)\, (g_1 + g_2)(y), \quad x \in F_2^r,\ y \in F_2^s \qquad (5.1)$$

is an $(r+s)$-variable $(t+m+1)$-resilient function. If $f_1 + f_2$ and $g_1 + g_2$
are non-constant, then the algebraic degree of $h$ equals
$\max(d^\circ f_1,\ d^\circ g_1,\ d^\circ(f_1 + f_2) + d^\circ(g_1 + g_2))$. The value
of the Walsh transform of $h$ at $(a,b) \in F_2^r \times F_2^s$ equals

$$\widehat{h}(a,b) = \frac{1}{2}\, \widehat{f_1}(a) \big[ \widehat{g_1}(b) + \widehat{g_2}(b) \big] + \frac{1}{2}\, \widehat{f_2}(a) \big[ \widehat{g_1}(b) - \widehat{g_2}(b) \big]. \qquad (5.2)$$

This implies

$$N_h \ge -2^{r+s-1} + 2^s (N_{f_1} + N_{f_2}) + 2^r (N_{g_1} + N_{g_2}) - (N_{f_1} + N_{f_2})(N_{g_1} + N_{g_2}). \qquad (5.3)$$

Moreover, if the Walsh transforms of $g_1$ and $g_2$ have disjoint supports, then,
denoting by $f$ the function $f(x, x_{r+1}) = (1 + x_{r+1})\, f_1(x) + x_{r+1}\, f_2(x)$,
we have

$$N_h \ge 2^{s-1} N_f + (2^r - N_f) \min_{i \in \{1,2\}} N_{g_i}. \qquad (5.4)$$

If, additionally, the Walsh transforms of $f_1$ and $f_2$ have disjoint supports, then

$$N_h = \min_{i,j \in \{1,2\}} \big( 2^{r+s-2} + 2^{r-1} N_{g_j} + 2^{s-1} N_{f_i} - N_{f_i} N_{g_j} \big). \qquad (5.5)$$

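Proof. We have:

$$\widehat{h}(a,b) = \sum_{y \in F_2^s / (g_1+g_2)(y)=0} \Big( \sum_{x \in F_2^r} (-1)^{f_1(x)+a\cdot x} \Big) (-1)^{g_1(y)+b\cdot y} + \sum_{y \in F_2^s / (g_1+g_2)(y)=1} \Big( \sum_{x \in F_2^r} (-1)^{f_2(x)+a\cdot x} \Big) (-1)^{g_1(y)+b\cdot y}$$

$$= \widehat{f_1}(a) \sum_{y \in F_2^s / (g_1+g_2)(y)=0} (-1)^{g_1(y)+b\cdot y} + \widehat{f_2}(a) \sum_{y \in F_2^s / (g_1+g_2)(y)=1} (-1)^{g_1(y)+b\cdot y}$$

$$= \widehat{f_1}(a) \sum_{y \in F_2^s} (-1)^{g_1(y)+b\cdot y} \Big( \frac{1 + (-1)^{(g_1+g_2)(y)}}{2} \Big) + \widehat{f_2}(a) \sum_{y \in F_2^s} (-1)^{g_1(y)+b\cdot y} \Big( \frac{1 - (-1)^{(g_1+g_2)(y)}}{2} \Big).$$

We deduce
$\widehat{h}(a,b) = \frac{1}{2} \widehat{f_1}(a) [\widehat{g_1}(b) + \widehat{g_2}(b)] + \frac{1}{2} \widehat{f_2}(a) [\widehat{g_1}(b) - \widehat{g_2}(b)]$,
that is, Relation (5.2).

If $(a,b)$ has weight at most $t+m+1$ then $a$ has weight at most $t$ or $b$ has
weight at most $m$; hence we have $\widehat{h}(a,b) = 0$. Thus, $h$ is
$(t+m+1)$-resilient.

If $f_1 + f_2$ and $g_1 + g_2$ are non-constant, then the algebraic degree of $h$
equals $\max(d^\circ f_1,\ d^\circ g_1,\ d^\circ(f_1 + f_2) + d^\circ(g_1 + g_2))$
because the terms of highest degrees in $(g_1 + g_2)(y)\, (f_1 + f_2)(x)$, in
$f_1(x)$ and in $g_1(y)$ cannot cancel each other. We deduce from Relation (5.2):

$$\max_{(a,b) \in F_2^r \times F_2^s} |\widehat{h}(a,b)| \le \frac{1}{2} \max_{a \in F_2^r} |\widehat{f_1}(a)| \Big( \max_{b \in F_2^s} |\widehat{g_1}(b)| + \max_{b \in F_2^s} |\widehat{g_2}(b)| \Big) + \frac{1}{2} \max_{a \in F_2^r} |\widehat{f_2}(a)| \Big( \max_{b \in F_2^s} |\widehat{g_1}(b)| + \max_{b \in F_2^s} |\widehat{g_2}(b)| \Big),$$

that is, using Relation (2.1):

$$2^{r+s} - 2N_h \le \frac{1}{2} (2^r - 2N_{f_1}) \big( (2^s - 2N_{g_1}) + (2^s - 2N_{g_2}) \big) + \frac{1}{2} (2^r - 2N_{f_2}) \big( (2^s - 2N_{g_1}) + (2^s - 2N_{g_2}) \big),$$

or equivalently Relation (5.3). If the Walsh transforms of $g_1$ and $g_2$ have
disjoint supports, then Relation (5.2), which can be rewritten

$$\widehat{h}(a,b) = \frac{1}{2}\, \widehat{g_1}(b) \big[ \widehat{f_1}(a) + \widehat{f_2}(a) \big] + \frac{1}{2}\, \widehat{g_2}(b) \big[ \widehat{f_1}(a) - \widehat{f_2}(a) \big] = \frac{1}{2}\, \widehat{g_1}(b)\, \widehat{f}(a,0) + \frac{1}{2}\, \widehat{g_2}(b)\, \widehat{f}(a,1),$$

implies:

$$\max_{(a,b) \in F_2^r \times F_2^s} |\widehat{h}(a,b)| \le \frac{1}{2} \max_{u \in F_2^{r+1}} |\widehat{f}(u)| \times \max_{i \in \{1,2\}} \Big( \max_{b \in F_2^s} |\widehat{g_i}(b)| \Big),$$

that is, using Relation (2.1):

$$2^{r+s} - 2N_h \le \frac{1}{2} (2^{r+1} - 2N_f) \times \max_{i \in \{1,2\}} (2^s - 2N_{g_i}),$$

or equivalently, Relation (5.4). If the supports of the Walsh transforms of $f_1$
and $f_2$ are disjoint, as well as those of $g_1$ and $g_2$, we deduce also from
Relation (5.2) that

$$\max_{(a,b) \in F_2^r \times F_2^s} |\widehat{h}(a,b)| = \frac{1}{2} \max_{i,j \in \{1,2\}} \Big( \max_{a \in F_2^r} |\widehat{f_i}(a)|\ \max_{b \in F_2^s} |\widehat{g_j}(b)| \Big). \qquad (5.6)$$

Using Relation (2.1), we deduce

$$2^{r+s} - 2N_h = \frac{1}{2} \max_{i,j \in \{1,2\}} \big( (2^r - 2N_{f_i})(2^s - 2N_{g_j}) \big),$$

which is equivalent to Relation (5.5). □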
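Relation (5.2) and the bound (5.3) follow from the definition of $h$ alone, without using resiliency, so they can be tested on arbitrary Boolean functions. The following sketch does this for random functions with $r = s = 3$; the sizes, the seed and the helper names are assumptions of the example, and the nonlinearity is computed as $2^{n-1} - \frac{1}{2}\max|\widehat{f}|$.

```python
import random

def walsh(tt):
    """Walsh transform of a truth table indexed by integers (bit i-1 <-> x_i)."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(size))
            for a in range(size)]

def nonlinearity(tt):
    return len(tt) // 2 - max(abs(w) for w in walsh(tt)) // 2

random.seed(1)
r, s = 3, 3                               # illustrative sizes (assumption)
rand_tt = lambda n: [random.randint(0, 1) for _ in range(2 ** n)]
f1, f2, g1, g2 = rand_tt(r), rand_tt(r), rand_tt(s), rand_tt(s)

# h(x, y) = f1(x) + g1(y) + (f1 + f2)(x) (g1 + g2)(y), index z = x + 2^r * y
mask = 2 ** r - 1
h = [f1[z & mask] ^ g1[z >> r]
     ^ ((f1[z & mask] ^ f2[z & mask]) & (g1[z >> r] ^ g2[z >> r]))
     for z in range(2 ** (r + s))]

Wf1, Wf2, Wg1, Wg2, Wh = (walsh(t) for t in (f1, f2, g1, g2, h))

relation_52_holds = all(
    2 * Wh[a + (b << r)] == Wf1[a] * (Wg1[b] + Wg2[b]) + Wf2[a] * (Wg1[b] - Wg2[b])
    for a in range(2 ** r) for b in range(2 ** s))

Nf = nonlinearity(f1) + nonlinearity(f2)
Ng = nonlinearity(g1) + nonlinearity(g2)
bound_53_holds = nonlinearity(h) >= -2 ** (r + s - 1) + 2 ** s * Nf + 2 ** r * Ng - Nf * Ng

print(relation_52_holds, bound_53_holds)
```

Relations (5.4) to (5.6) need the disjointness hypotheses and are therefore not tested here on random inputs.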

Relations (2.1) and (5.6) (as well as Relations (5.5) or (5.4)) imply:

Corollary 5.1. Let $f_1$ and $f_2$ be two $(r, t, -, 2^{r-1} - 2^{t+1})$ functions with
disjoint Walsh supports and such that $f_1 + f_2$ has degree $r - t - 1$. Let $g_1$
and $g_2$ be two $(s, m, -, 2^{s-1} - 2^{m+1})$ functions with disjoint Walsh supports
and such that $g_1 + g_2$ has degree $s - m - 1$. Then the function
$h(x,y) = f_1(x) + g_1(y) + (f_1 + f_2)(x)\, (g_1 + g_2)(y)$ is an
$(r+s,\ t+m+1,\ r+s-t-m-2,\ 2^{r+s-1} - 2^{t+m+2})$ function. Hence,
it achieves Siegenthaler's and Sarkar et al.'s bounds.
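A concrete instance may help to read the parameters of Corollary 5.1. In the sketch below, both pairs $\{f_1, f_2\}$ and $\{g_1, g_2\}$ are taken equal to the pair $x_1x_4 + x_2x_4 + x_2 + x_3$, $x_3x_4 + x_1 + x_2 + x_3$ of $(4, 1, 2, 4)$ functions from Section 4; reusing the same pair on both sides is an assumption made for illustration. With $r = s = 4$ and $t = m = 1$, the expected profile is $(8, 3, 4, 112)$.

```python
def walsh(tt):
    """Walsh transform of a truth table indexed by integers (bit i-1 <-> x_i)."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(size))
            for a in range(size)]

def degree(tt):
    """Algebraic degree, via the binary Moebius transform of the truth table."""
    anf = list(tt)
    for i in range(len(tt).bit_length() - 1):
        for x in range(len(anf)):
            if (x >> i) & 1:
                anf[x] ^= anf[x ^ (1 << i)]
    return max((bin(x).count("1") for x in range(len(anf)) if anf[x]), default=0)

def resiliency_order(W):
    """Largest k such that the spectrum vanishes on all weights 0..k (-1 if none)."""
    order = -1
    for k in range(len(W).bit_length()):
        if any(W[a] != 0 for a in range(len(W)) if bin(a).count("1") == k):
            break
        order = k
    return order

def table(anf):
    return [anf(*[(y >> i) & 1 for i in range(4)]) for y in range(16)]

g1 = table(lambda x1, x2, x3, x4: (x1 & x4) ^ (x2 & x4) ^ x2 ^ x3)
g2 = table(lambda x1, x2, x3, x4: (x3 & x4) ^ x1 ^ x2 ^ x3)
f1, f2 = g1, g2                      # same pair on the x side (illustrative choice)
r = s = 4

h = [f1[z & 15] ^ g1[z >> 4] ^ ((f1[z & 15] ^ f2[z & 15]) & (g1[z >> 4] ^ g2[z >> 4]))
     for z in range(2 ** (r + s))]
Wh = walsh(h)
profile = (r + s, resiliency_order(Wh), degree(h),
           2 ** (r + s - 1) - max(abs(w) for w in Wh) // 2)
print(profile)
```

The degree 4 equals $n - m - 1 = 8 - 3 - 1$, which is the equality case of Siegenthaler's bound referred to in the corollary.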

Remark 5.1. The elementary secondary constructions, recalled in Section 3, are
particular cases of our construction; Rothaus' construction (see Subsection 3.1.2)
corresponds to $f_1 = f_2$ or $g_1 = g_2$, Siegenthaler's construction corresponds to
$s = 1$, $g_1 = 0$ and $g_2(y_1) = y_1$; Tarannikov's construction does not seem to
enter in our framework, in general; but it does in the particular situations in
which it is actually applied by Tarannikov.

Remark 5.2. If $t \le \frac{r}{2} - 2$ ($r$ even) and $m = \frac{s}{2} - 1$ ($s$ even),
if $f_1$ and $f_2$ are two $(r, t, -, 2^{r-1} - 2^{\frac{r}{2}-1} - 2^{t+1})$ functions
(achieving Sarkar et al.'s bound) with disjoint Walsh supports and if $g_1$ and
$g_2$ are two $(s, m, -, 2^{s-1} - 2^{m+1})$ functions with disjoint Walsh supports,
then $h$ is an $(r+s,\ t+m+1,\ -,\ 2^{r+s-1} - 2^{\frac{r+s}{2}-1} - 2^{m+t+2})$
function (achieving Sarkar et al.'s bound if $t + m + 1 \le \frac{r+s}{2} - 2$),
according to Relations (2.1) and (5.6) and to the equality
$\frac{1}{2} \big( 2^{\frac{r}{2}} + 2^{t+2} \big)\, 2^{m+2} = 2^{\frac{r}{2}+m+1} + 2^{m+t+3}$.

Proposition 5.1. Under the hypothesis of Theorem 5.1, let us assume that $g_1$
and $g_2$ are plateaued with the same amplitude (this is the case if $g_1$ and $g_2$
are $(s, m, s-m-1, 2^{s-1} - 2^{m+1})$ functions), and that the Walsh transforms of
$f_1$ and $f_2$ have disjoint supports, as well as those of $g_1$ and $g_2$. If the
union of the supports of $\widehat{g_1}$ and $\widehat{g_2}$ is invariant under the
mapping $y \mapsto y + (0,\dots,0,1)$, then the Walsh transforms of the restrictions
$h_1(x,y) = h(x, y_1,\dots,y_{s-1}, 0)$ and $h_2(x,y) = h(x, y_1,\dots,y_{s-1}, 1)$
of $h$ have disjoint supports.

Proof. For every $i \in \{1,2\}$, we have:

$$\widehat{h_i}(a, b_1,\dots,b_{s-1}) = \frac{1}{2} \Big( \widehat{h}(a, b_1,\dots,b_{s-1}, 0) - (-1)^i\, \widehat{h}(a, b_1,\dots,b_{s-1}, 1) \Big), \qquad (5.7)$$

and we know, according to Relation (5.2), that the numbers
$\widehat{h}(a, b_1,\dots,b_{s-1}, 0)$ and $\widehat{h}(a, b_1,\dots,b_{s-1}, 1)$ are
then either equal to each other or opposite. Thus, at most one of the values
$\widehat{h_1}(a, b_1,\dots,b_{s-1})$ and $\widehat{h_2}(a, b_1,\dots,b_{s-1})$ can be
nonzero. Hence, the supports of the Walsh transforms of $h_1$ and $h_2$ are
disjoint. □
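The statement of Proposition 5.1 can be observed on a small case. The sketch below builds $h$ from the pair $x_1x_4 + x_2x_4 + x_2 + x_3$, $x_3x_4 + x_1 + x_2 + x_3$ used on both sides (an illustrative assumption), restricts it to $y_s = 0$ and $y_s = 1$, and compares the two Walsh supports. Truth tables are indexed by integers whose top bit is the last coordinate of $y$, so the two restrictions are simply the two halves of the table.

```python
def walsh(tt):
    """Walsh transform of a truth table indexed by integers (bit i-1 <-> x_i)."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(size))
            for a in range(size)]

def table(anf):
    return [anf(*[(y >> i) & 1 for i in range(4)]) for y in range(16)]

g1 = table(lambda x1, x2, x3, x4: (x1 & x4) ^ (x2 & x4) ^ x2 ^ x3)
g2 = table(lambda x1, x2, x3, x4: (x3 & x4) ^ x1 ^ x2 ^ x3)
f1, f2 = g1, g2                      # same pair on the x side (illustrative choice)

h = [f1[z & 15] ^ g1[z >> 4] ^ ((f1[z & 15] ^ f2[z & 15]) & (g1[z >> 4] ^ g2[z >> 4]))
     for z in range(256)]

# the hypothesis: the union of the supports of g1^ and g2^ is invariant under
# b -> b + (0, 0, 0, 1), i.e. under flipping bit 3 of the index
union = {b for b in range(16) if walsh(g1)[b] or walsh(g2)[b]}
union_invariant = union == {b ^ 8 for b in union}

h1, h2 = h[:128], h[128:]            # restrictions y_s = 0 and y_s = 1
W1, W2 = walsh(h1), walsh(h2)
restrictions_disjoint = all(u == 0 or v == 0 for u, v in zip(W1, W2))

low = [a for a in range(128) if bin(a).count("1") <= 3]
restrictions_3_resilient = all(W1[a] == 0 and W2[a] == 0 for a in low)
nl1 = 64 - max(abs(w) for w in W1) // 2
nl2 = 64 - max(abs(w) for w in W2) // 2
print(union_invariant, restrictions_disjoint, restrictions_3_resilient, nl1, nl2)
```

In this instance the two restrictions are 7-variable, 3-resilient functions of nonlinearity $2^6 - 2^4 = 48$, so they can play the role of a new pair in Corollary 5.1.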

Note that these two restrictions of $h$ are related to the corresponding restrictions
of $g_1$ and $g_2$ through Relation (5.1), just as $h$ is related to $g_1$ and $g_2$.
We deduce that, if $h$ has been obtained by Corollary 5.1, then $h_1$ and $h_2$
satisfy the hypothesis for $f_1$ and $f_2$ in this same corollary, with $r + s - 1$ in
the place of $r$, and $t + m + 1$ in the place of $t$. Corollary 5.1 can then be
applied again, with $h_1$ and $h_2$ instead of $f_1$ and $f_2$. This leads to infinite
sequences of functions. This has been observed by Pasalic et al. in [24], in the
particular case of their construction. But what is new in the present paper is the
symmetry between $f_1, f_2$ and $g_1, g_2$. Starting, at least, with two pairs
$\{f_1, f_2\}$ and $\{g_1, g_2\}$ of functions satisfying the hypothesis of
Corollary 5.1 and such that the union of the supports of $\widehat{f_1}$ and
$\widehat{f_2}$ is invariant under the mapping $x \mapsto x + (0,\dots,0,1)$ and the
union of the supports of $\widehat{g_1}$ and $\widehat{g_2}$ is invariant under the
mapping $y \mapsto y + (0,\dots,0,1)$, we get a whole infinite multiple branching tree
of functions, instead of a simple infinite sequence. This gives more values of $n$
and $t$ for which $(n, t, n-t-1, 2^{n-1} - 2^{t+1})$ functions (having no nonzero
linear structure, see below) can be obtained. It gives also, for every value of $n$
and $t$ such that $t \ge \frac{2n-10}{3}$ and $t > \frac{n}{2} - 2$, many more
$(n, t, n-t-1, 2^{n-1} - 2^{t+1})$ functions obtained thanks to the combination with
the elementary construction "adding a variable", than Tarannikov et al.'s
construction, combined with this same construction, does (see [31]).

Moreover, as we prove in the next proposition, the functions obtained with
the generalized construction can have no nonzero linear structure.

Proposition 5.2. Let $f_1$ and $f_2$ be two $r$-variable Boolean functions. Let $g_1$
and $g_2$ be two $s$-variable Boolean functions.

1. If $f_1 + f_2$ or $g_1 + g_2$ is constant and if $f_1$, $f_2$, $g_1$ and $g_2$ admit
no nonzero linear structure, then the function
$h(x,y) = f_1(x) + g_1(y) + (f_1 + f_2)(x)\, (g_1 + g_2)(y)$
has no nonzero linear structure.

2. If $f_1 + f_2$ and $g_1 + g_2$ are non-constant and if the two following conditions
are satisfied, then $h$ has no nonzero linear structure:

a. for every $a$, the function $f_1 + f_2 + D_a f_1$ is non-constant and for every
$b$, the function $g_1 + g_2 + D_b g_1$ is non-constant;

b. there does not exist $a \ne 0$ such that $D_a f_1$ and $D_a f_2$ are constant and
equal to each other; there does not exist $b \ne 0$ such that $D_b g_1$ and $D_b g_2$
are constant and equal to each other.

Proof. For every nonzero $(a,b) \in F_2^r \times F_2^s$, we have

$$D_{(a,b)} h(x,y) = D_a f_1(x) + D_b g_1(y) + (g_1 + g_2)(y)\, D_a(f_1 + f_2)(x) + (f_1 + f_2)(x)\, D_b(g_1 + g_2)(y) + D_a(f_1 + f_2)(x)\, D_b(g_1 + g_2)(y).$$

1. If $f_1 + f_2$ is null then $D_{(a,b)} h(x,y) = D_a f_1(x) + D_b g_1(y)$ and if
$f_1 + f_2$ equals the constant function 1, then
$D_{(a,b)} h(x,y) = D_a f_1(x) + D_b g_2(y)$; in both cases, $f_1$, $f_2$, $g_1$ and
$g_2$ admitting no nonzero linear structure, $D_{(a,b)} h$ is not constant.
Similarly, if $g_1 + g_2$ is constant, then $D_{(a,b)} h(x,y)$ is not constant.

2. The terms of highest degree with respect to $x$ in
$(f_1 + f_2)(x)\, D_b(g_1 + g_2)(y)$, and the terms of highest degree with respect to
$y$ in $D_a(f_1 + f_2)(x)\, (g_1 + g_2)(y)$, can be all cancelled in
$D_{(a,b)} h(x,y)$ only if $D_a(f_1 + f_2)$ and $D_b(g_1 + g_2)$ are constant, say
$D_a(f_1 + f_2) = \epsilon$ and $D_b(g_1 + g_2) = \eta$, in which case
$D_{(a,b)} h(x,y)$ equals
$[D_a f_1(x) + \eta\, (f_1 + f_2)(x)] + [D_b g_1(y) + \epsilon\, (g_1 + g_2)(y)] + \epsilon \eta$.
If $\epsilon = 1$ or $\eta = 1$, then Condition a completes the proof. Otherwise,
$\epsilon = \eta = 0$ and Condition b completes the proof. □

Since, if $f_1 + f_2$ is non-constant and has degree at least equal to
$d^\circ f_1$, the function $f_1 + f_2 + D_a f_1$ cannot be constant ($D_a f_1$
having at most degree $d^\circ f_1 - 1$), we deduce:

Corollary 5.2. If $d^\circ(f_1 + f_2) \ge d^\circ f_1 \ge 1$ and
$d^\circ(g_1 + g_2) \ge d^\circ g_1 \ge 1$, and if $f_1$ and $f_2$ do not share a
nonzero linear structure, as well as $g_1$ and $g_2$, then $h$ admits
no nonzero linear structure.
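Linear structures are easy to enumerate exhaustively for small functions: a nonzero direction $d$ is a linear structure of $f$ when $x \mapsto f(x) + f(x + d)$ is constant. The sketch below lists them for the two 4-variable functions $x_1x_4 + x_2x_4 + x_2 + x_3$ and $x_3x_4 + x_1 + x_2 + x_3$ and for the 8-variable function $h$ built from this pair on both sides; using the same pair twice is an assumption of the example. Directions are encoded as integers whose bit $i-1$ is the $i$-th coordinate.

```python
def table(anf):
    return [anf(*[(y >> i) & 1 for i in range(4)]) for y in range(16)]

def linear_structures(tt):
    """Nonzero directions d for which z -> tt[z] + tt[z + d] is constant."""
    size = len(tt)
    return [d for d in range(1, size)
            if len({tt[z] ^ tt[z ^ d] for z in range(size)}) == 1]

g1 = table(lambda x1, x2, x3, x4: (x1 & x4) ^ (x2 & x4) ^ x2 ^ x3)
g2 = table(lambda x1, x2, x3, x4: (x3 & x4) ^ x1 ^ x2 ^ x3)
f1, f2 = g1, g2                      # same pair on the x side (illustrative choice)

h = [f1[z & 15] ^ g1[z >> 4] ^ ((f1[z & 15] ^ f2[z & 15]) & (g1[z >> 4] ^ g2[z >> 4]))
     for z in range(256)]

ls_g1 = linear_structures(g1)        # 3 = (1,1,0,0), 4 = (0,0,1,0), 7 = (1,1,1,0)
ls_g2 = linear_structures(g2)        # 1 = (1,0,0,0), 2 = (0,1,0,0), 3 = (1,1,0,0)
ls_h = linear_structures(h)

# on the shared direction (1,1,0,0) the two derivatives are different constants
d = 3
shared_derivatives = (g1[0] ^ g1[d], g2[0] ^ g2[d])
print(ls_g1, ls_g2, ls_h, shared_derivatives)
```

Both building blocks have nonzero linear structures, and they even share the direction $(1,1,0,0)$, but with derivatives 1 and 0; Condition b of Proposition 5.2 is therefore satisfied and the combined function has none.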







Remark 5.3. We do not have to assume in the second item of Proposition 5.2
or in Corollary 5.2 that the functions $f_1$, $f_2$, $g_1$ and $g_2$ have no nonzero
linear structure. This is why Tarannikov et al.'s construction, which is a
combination of elementary constructions producing functions having nonzero linear
structures, makes it possible to design functions having no nonzero linear
structure.



6. Examples of Pairs of Functions Satisfying the Hypothesis of 
Corollary 5.1 and the Additional Condition of Proposition 5.1 

We are looking for pairs of functions (which can indifferently play the role of
$\{f_1, f_2\}$ or the role of $\{g_1, g_2\}$) satisfying the hypothesis of
Corollary 5.1 (and having preferably no nonzero linear structure). We want them to
lead to infinite classes of functions, and thus, to satisfy the condition of
Proposition 5.1. So we seek pairs $\{g_1, g_2\}$ of
$(s, m, s-m-1, 2^{s-1} - 2^{m+1})$ functions with disjoint Walsh supports, such that
$g_1 + g_2$ has degree $s - m - 1$ and such that the union of the supports of
$\widehat{g_1}$ and $\widehat{g_2}$ is invariant under the mapping
$y \mapsto y + (0,\dots,0,1)$. We reduce ourselves to the cases $m \le s - 2$ to
avoid the degenerate situation in which $g_1 + g_2$ is constant. We shall be able to
give a complete description of all pairs for $m = s - 2$ and for $m = s - 3$. For
$m \le s - 4$, we shall give examples valid for all but finitely many cases, but
the complete classification is open.

6.1. The Case m = s - 2

For every $s \ge 2$, two $(s, s-2, 1, 0)$ functions $g_1$ and $g_2$ such that
$g_1 + g_2$ has degree 1 and with disjoint Walsh spectra are two affine functions
$g_1(x) = u \cdot x + \epsilon$ and $g_2(x) = v \cdot x + \eta$ where $u$ and $v$ are
distinct and have weights at least $s - 1$. This leads to the secondary construction
$(f_1, f_2) \mapsto f_1(x) + u \cdot y + \epsilon + (f_1 + f_2)(x)\, ((u + v) \cdot y + \epsilon + \eta)$.
It is possible to have additionally that the union of the supports of
$\widehat{g_1}$ and $\widehat{g_2}$ is invariant under the mapping
$y \mapsto y + (0,\dots,0,1)$: the supports of these Walsh transforms are $\{u\}$
and $\{v\}$; so we just have to take $u + v = (0,\dots,0,1)$. But we have then
essentially Siegenthaler's construction.

Clearly, the functions $g_1$ and $g_2$ do not satisfy the conditions of
Proposition 5.2, and indeed, $h$ admits as linear structure any vector $(0,b)$ such
that $b \in (u + v)^\perp$, since we have
$D_{(0,b)} h(x,y) = u \cdot b + (f_1 + f_2)(x)\, ((u + v) \cdot b)$.

So taking $m = s - 2$ is not completely satisfactory.

6.2. The Case m = s - 3

We determine now the pairs of $(s, s-3, 2, 2^{s-1} - 2^{s-2})$ functions $g_1$ and
$g_2$ such that $g_1 + g_2$ has degree 2 and with disjoint Walsh spectra: for every
$i = 1, 2$, $g_i$ having nonlinearity $2^{s-2}$, there exists an affine function
$\ell_i$ such that $g_i + \ell_i$ has weight $2^{s-2}$ and we know (cf. [18]) that,
being quadratic, $g_i + \ell_i$ is then the indicator of an $(s-2)$-dimensional
flat. Without loss of generality, set
$g_i(x) = (a_i \cdot x)(b_i \cdot x) + c_i \cdot x + \epsilon_i$, where $a_i$ and $b_i$
are linearly independent ($i = 1, 2$). We have then (see for instance [7])
$\widehat{g_i}(u) = 0$ if $a_i$, $b_i$ and $u + c_i$ are linearly independent; and we
have $|\widehat{g_i}(u)| = 2^{s-1}$ otherwise (that is, if
$u \in c_i + \langle a_i, b_i \rangle$ where $\langle a_i, b_i \rangle$ is the vector
space spanned by $a_i$ and $b_i$). The supports of $\widehat{g_1}$ and
$\widehat{g_2}$ are disjoint if and only if
$(c_1 + \langle a_1, b_1 \rangle) \cap (c_2 + \langle a_2, b_2 \rangle) = \emptyset$
(which is equivalent to saying that the function
$(1 + x_{s+1})\, g_1(x) + x_{s+1}\, g_2(x)$ belongs to the class $Q_1$ introduced in
[7]). And $g_1$ and $g_2$ are $(s-3)$-resilient if and only if the flats
$c_1 + \langle a_1, b_1 \rangle$ and $c_2 + \langle a_2, b_2 \rangle$ have minimum
weights at least $s - 2$.

There are only two situations in which the condition of Proposition 5.1 (i.e.,
the union of the supports of $\widehat{g_1}$ and $\widehat{g_2}$ is invariant under
the mapping $y \mapsto y + (0,\dots,0,1)$) is also satisfied: either we have
$\langle a_1, b_1 \rangle = \langle a_2, b_2 \rangle$ (so that the set
$(c_1 + \langle a_1, b_1 \rangle) \cup (c_2 + \langle a_2, b_2 \rangle)$ is a flat)
and $(0,\dots,0,1) \in \langle c_1 + c_2, a_1, b_1 \rangle$ (since
$\langle c_1 + c_2, a_1, b_1 \rangle$ is the direction of this flat), or the vector
$(0,\dots,0,1)$ belongs to $\langle a_1, b_1 \rangle$ and to
$\langle a_2, b_2 \rangle$.

• If $\langle a_1, b_1 \rangle = \langle a_2, b_2 \rangle$, then $g_1 + g_2$ is
affine. We may assume without loss of generality that $a_1 = a_2$ and $b_1 = b_2$.
Function $h(x,y)$ has the form
$f_1(x) + (a_1 \cdot y)(b_1 \cdot y) + c_1 \cdot y + \epsilon_1 + (d \cdot y + \eta)(f_1 + f_2)(x)$,
where $d = c_1 + c_2$ and $\eta = \epsilon_1 + \epsilon_2$. Then $D_{(0,b)} h(x,y)$
has the form
$e \cdot y + (a_1 \cdot b)(b_1 \cdot b) + c_1 \cdot b + (d \cdot b)(f_1 + f_2)(x)$, where
$e = (b_1 \cdot b)\, a_1 + (a_1 \cdot b)\, b_1$. Thus, for every
$b \in \{a_1, b_1, d\}^\perp$, the vector $(0,b)$ is a linear structure for $h$, and
$h$ admits therefore nonzero linear structures, if $s \ge 4$.

• If $(0,\dots,0,1)$ belongs to $\langle a_1, b_1 \rangle$ and to
$\langle a_2, b_2 \rangle$, then we may assume without loss of generality that
$b_1 = b_2 = (0,\dots,0,1)$. We still have to find $a_1$, $c_1$, $a_2$ and $c_2$,
whose last coordinates can be taken, without loss of generality, equal to 0, and
such that $c_1$, $c_1 + a_1$, $c_2$ and $c_2 + a_2$ are distinct and have weights at
least $s - 2$. The set of vectors of length $s$, with last coordinates equal to 0,
and with weights at least $s - 2$ has size $\binom{s-1}{s-2} + \binom{s-1}{s-1} = s$.
Thus we need $s \ge 4$. The choice of any two disjoint pairs of vectors (that is, of
disjoint lines) $\{c_1, c_1 + a_1\}$ and $\{c_2, c_2 + a_2\}$ in this set leads then
to a pair of functions with the desired properties.

If $s = 4$, then such choice is a partition and all choices lead in fact to the same
pair of functions $g_1$ and $g_2$, up to permutation of the coordinates. Indeed, in
such a partition, one line (and one only) must contain the vector $(1,1,1,0)$ and
the other vector of the same line is one of the 3 vectors of weight 2 covered by
this vector of weight 3. So, the construction due to Tarannikov et al. is the only
possible one, up to permutation of the coordinates.
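For $s = 4$ the counting argument and the resulting pairs can be enumerated directly. The sketch below lists the vectors with last coordinate 0 and weight at least $s - 2$, forms the three partitions into two lines, and checks for each the properties required of $g_i(x) = (a_i \cdot x)(b \cdot x) + c_i \cdot x$ with $b = (0,\dots,0,1)$. Vectors are encoded as integers whose bit $i-1$ is the $i$-th coordinate; this encoding and the helper names are assumptions of the example, which is written for $s = 4$ only.

```python
def parity(v):
    return bin(v).count("1") & 1

def walsh(tt):
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ parity(a & x)) for x in range(size))
            for a in range(size)]

s = 4
b = 1 << (s - 1)                     # (0, ..., 0, 1): bit s-1 is the last coordinate
pool = [v for v in range(b) if bin(v).count("1") >= s - 2]

def quadratic(a, c):
    """Truth table of g(x) = (a.x)(b.x) + c.x."""
    return [(parity(a & x) & parity(b & x)) ^ parity(c & x) for x in range(1 << s)]

def is_good_pair(line1, line2):
    (c1, d1), (c2, d2) = line1, line2          # each line is {c, c + a}
    W1 = walsh(quadratic(c1 ^ d1, c1))
    W2 = walsh(quadratic(c2 ^ d2, c2))
    low = [u for u in range(1 << s) if bin(u).count("1") <= s - 3]
    resilient = all(W1[u] == 0 and W2[u] == 0 for u in low)
    disjoint = all(W1[u] == 0 or W2[u] == 0 for u in range(1 << s))
    amplitude_ok = max(map(abs, W1)) == max(map(abs, W2)) == 2 ** (s - 1)
    return resilient and disjoint and amplitude_ok

# partitions of the pool into two lines (valid as written for s = 4 only)
partitions = []
for other in pool[1:]:
    line1 = (pool[0], other)
    line2 = tuple(v for v in pool if v not in line1)
    partitions.append((line1, line2))

results = [is_good_pair(l1, l2) for l1, l2 in partitions]
print(len(pool), partitions, results)
```

A maximal Walsh amplitude of $2^{s-1}$ corresponds to nonlinearity $2^{s-1} - 2^{s-2}$, so each partition yields a pair of $(4, 1, 2, 4)$ functions with disjoint Walsh supports.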

Remark 6.1. Let us see, out of curiosity, what partition corresponds to the
$(4, 1, 2, 4)$ functions in the construction of Tarannikov et al.: we have
$g_1(x) = x_1x_4 + x_2x_4 + x_2 + x_3 = (a_1 \cdot x)(b_1 \cdot x) + c_1 \cdot x$ and
$g_2(x) = x_3x_4 + x_1 + x_2 + x_3 = (a_2 \cdot x)(b_2 \cdot x) + c_2 \cdot x$, with
$a_1 = (1,1,0,0)$, $b_1 = (0,0,0,1)$, $c_1 = (0,1,1,0)$, $a_2 = (0,0,1,0)$,
$b_2 = (0,0,0,1)$ and $c_2 = (1,1,1,0)$. Thus the disjoint lines are
$\{c_1, c_1 + a_1\} = \{(0,1,1,0), (1,0,1,0)\}$ and
$\{c_2, c_2 + a_2\} = \{(1,1,1,0), (1,1,0,0)\}$. Note that $g_1$ and $g_2$ belong to
Maiorana-McFarland's class, since
$g_1(x) = (x_1, x_2, x_3) \cdot (x_4, 1 + x_4, 1)$ and
$g_2(x) = (x_1, x_2, x_3) \cdot (1, 1, 1 + x_4)$. The mappings
$\phi_1 : x_4 \mapsto (x_4, 1 + x_4, 1)$ and $\phi_2 : x_4 \mapsto (1, 1, 1 + x_4)$
are injective, all the elements of their images have weights at least 2, and
$\phi_1(F_2)$ and $\phi_2(F_2)$ are disjoint, so that the Walsh supports of $g_1$
and $g_2$ are disjoint. According to Proposition 5.2, if
$d^\circ(f_1 + f_2) = d^\circ f_1$ and if $f_1$ and $f_2$ do not share a nonzero
linear structure, then $h$ has no nonzero linear structure, because for every
$b \ne 0$, the function $g_1 + g_2 + D_b g_1$ is non-constant and there does not
exist $b \ne 0$ such that $D_b g_1$ and $D_b g_2$ are constant and equal to each
other ($g_1$ and $g_2$ share here the nonzero linear structure $b = (1,1,0,0)$,
but, for this value of $b$, the derivative $D_b g_1$ equals 1 and is different from
the derivative $D_b g_2$, which equals 0).



If $s \ge 5$, then we have, up to permutation of the coordinates, two possible
choices of the lines $\{c_1, c_1 + a_1\}$ and $\{c_2, c_2 + a_2\}$: one in which the
word $(1,\dots,1,0)$ appears in one of the lines, and one in which it does not. The
first choice gives, up to permutation of the coordinates:

$$c_1 + \langle a_1, b_1 \rangle = \{(0,1,1,\dots,1,0),\ (1,0,1,\dots,1,0),\ (0,1,1,\dots,1,1),\ (1,0,1,\dots,1,1)\},$$

$$c_2 + \langle a_2, b_2 \rangle = \{(1,1,1,1,\dots,1,0),\ (1,1,0,1,\dots,1,0),\ (1,1,1,1,\dots,1,1),\ (1,1,0,1,\dots,1,1)\};$$

which leads to the following functions:
$g_1(x) = (x_1 + x_2)\, x_s + \sum_{i=2}^{s-1} x_i$,
$g_2(x) = x_3 x_s + \sum_{i=1}^{s-1} x_i$,
and thus to the secondary construction:

$$h(x,y) = f_1(x) + (y_1 + y_2)\, y_s + \sum_{i=2}^{s-1} y_i + (f_1 + f_2)(x)\, \big( (y_1 + y_2 + y_3)\, y_s + y_1 \big).$$

The second choice gives, up to permutation of the coordinates:

$$c_1 + \langle a_1, b_1 \rangle = \{(0,1,1,1,\dots,1,0),\ (1,0,1,1,\dots,1,0),\ (0,1,1,1,\dots,1,1),\ (1,0,1,1,\dots,1,1)\},$$

$$c_2 + \langle a_2, b_2 \rangle = \{(1,1,1,0,1,\dots,1,0),\ (1,1,0,1,\dots,1,0),\ (1,1,1,0,1,\dots,1,1),\ (1,1,0,1,\dots,1,1)\};$$

which leads to the following functions:
$g_1(x) = (x_1 + x_2)\, x_s + \sum_{i=2}^{s-1} x_i$,
$g_2(x) = (x_3 + x_4)\, x_s + \sum_{i=1}^{s-1} x_i + x_3$,
and thus to the secondary construction:

$$h(x,y) = f_1(x) + (y_1 + y_2)\, y_s + \sum_{i=2}^{s-1} y_i + (f_1 + f_2)(x)\, \big( (y_1 + y_2 + y_3 + y_4)\, y_s + y_1 + y_3 \big).$$

In both cases, from two $(r, t, r-t-1, 2^{r-1} - 2^{t+1})$ functions $f_1$ and $f_2$
with disjoint Walsh supports and such that $f_1 + f_2$ has degree $r - t - 1$, we
construct an $(r+s,\ t+m+1,\ r+s-t-m-2,\ 2^{r+s-1} - 2^{t+m+2})$ function $h$, where
$m = s - 3$, and this construction has exactly the same nice properties as
Tarannikov et al.'s construction, except that $h$ has the nonzero linear
structures $(0,b)$ with $b_1 = b_2 = b_3 = b_s = 0$ and $b_4 = 1$ in the first case,
and with $b_1 = b_2 = b_3 = b_4 = 1$ and $b_s = 0$ in the second one.



6.3. The Case m = s - 4

We wish now to obtain two $(s, s-4, 3, 2^{s-1} - 2^{s-3})$ functions $g_1$ and $g_2$
whose sum has degree 3 and with disjoint Walsh spectra. Note that, according to
Remark 4.1, the pair obtained at Subsection 6.2 of $(s, s-3, 2, 2^{s-1} - 2^{s-2})$
functions ($s \ge 4$) whose sum has degree 2 and with disjoint Walsh spectra, gives
an $(s+1, s-3, 3, 2^s - 2^{s-2})$ function. So, applying this observation to
$s - 1 \ge 4$ gives an $(s, s-4, 3, 2^{s-1} - 2^{s-3})$ function for every
$s \ge 5$. Such a function can also easily be obtained by Maiorana-McFarland's
construction, as observed in [28]. But we seek a pair of such functions; we want
their sum to have degree 3 and their Walsh spectra to be disjoint. According to
Proposition 5.1, from every pair of functions obtained at Subsection 6.1, and every
pair obtained at Subsection 6.2, we can construct the desired pair by using our
construction, if the union of the supports of the Walsh transforms of the functions
in one of the pairs is invariant under the mapping $y \mapsto y + (0,\dots,0,1)$.
Also, according to Remark 4.1, seeking the desired pair is equivalent to seeking an
$(s+1, s-4, 4, 2^s - 2^{s-3})$ function, that is, denoting $s + 1$ by $s'$, an
$(s', s'-5, 4, 2^{s'-1} - 2^{s'-4})$ function. The universal bound and the fact that
bent functions are not balanced imply that $s' - 4 > \frac{s'}{2} - 1$, that is,
$s' > 6$, which means that we can hope to obtain such a pair for $s \ge 6$ only. A
$(7, 2, 4, 56)$ function with no nonzero linear structure has been exhibited in
[24]. The construction "adding a variable" then makes it possible to obtain
$(s', s'-5, 4, 2^{s'-1} - 2^{s'-4})$ functions for every $s' \ge 7$ and to deduce,
for every $s \ge 6$, pairs $(g_1, g_2)$ of $(s, s-4, 3, 2^{s-1} - 2^{s-3})$
functions $g_1$ and $g_2$ whose sum has degree 3, and with disjoint Walsh spectra.

6.4. The Cases m ≤ s - 5

Here again, according to Proposition 5.1, from every desired pair of $s$-variable
$m$-resilient functions such that $m = s - k$, and every desired pair of
$s'$-variable $m'$-resilient functions such that $m' = s' - k'$, we can construct a
desired pair of $s''$-variable $m''$-resilient functions such that
$m'' = s'' - k - k' + 1$, if the union of the supports of the Walsh transforms of
the functions in one of the pairs is invariant under the mapping
$y \mapsto y + (0,\dots,0,1)$.

Also, for every positive integer $k$ and for every $s \ge k - 2 + 2^{k-2}$, there
exist $(s, s-k, k-1, 2^{s-1} - 2^{s-k+1})$ functions. Indeed, set $s_2 = k - 2$, and
for every $s$, set $s_1 = s - s_2$; there exists an injective mapping
$\phi : F_2^{s_2} \to \{y \in F_2^{s_1};\ wt(y) \ge s - k + 1\}$ if and only if
$\sum_{i=s-k+1}^{s_1} \binom{s_1}{i} = s - k + 3 \ge 2^{s_2}$. We deduce the
existence of an $(s-k)$-resilient Maiorana-McFarland's function for every $s$ such
that $s - k + 2 \ge 2^{k-2}$.

Consequently, according to Remark 4.1, for every positive integer $k$ and for
every $s \ge k - 3 + 2^{k-2}$, there exist pairs of
$(s, s-k+1, k-2, 2^{s-1} - 2^{s-k+2})$ functions whose sum has degree $k - 2$ and
with disjoint Walsh spectra.
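The existence argument for the Maiorana-McFarland functions of this subsection is a counting condition, which the following sketch evaluates with binomial coefficients before building one explicit instance. The instance $k = 4$, $s = 6$ and the particular injective mapping `phi` are assumptions chosen for illustration; the images of `phi` are encoded as integers whose bits are the coordinates of the vector.

```python
from math import comb

def walsh(tt):
    """Walsh transform of a truth table indexed by integers (bit i-1 <-> x_i)."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(size))
            for a in range(size)]

def injective_phi_exists(s, k):
    """Are there at least 2^(s2) vectors of weight >= s-k+1 in F_2^(s1)?"""
    s2 = k - 2
    s1 = s - s2
    heavy = sum(comb(s1, i) for i in range(s - k + 1, s1 + 1))
    return heavy >= 2 ** s2

# the condition s >= k - 2 + 2^(k-2) is sufficient over a range of small cases
condition_sufficient = all(
    injective_phi_exists(s, k)
    for k in range(3, 8)
    for s in range(k - 2 + 2 ** (k - 2), k + 2 ** (k - 2) + 10))

# explicit instance with k = 4, s = 6, hence s2 = 2 and s1 = 4
k, s = 4, 6
s2, s1 = k - 2, s - (k - 2)
phi = [7, 11, 13, 14]                # four distinct vectors of weight 3 in F_2^4
f = [bin((z & (2 ** s1 - 1)) & phi[z >> s1]).count("1") & 1 for z in range(2 ** s)]
W = walsh(f)
is_resilient = all(W[u] == 0 for u in range(2 ** s) if bin(u).count("1") <= s - k)
nonlinearity = 2 ** (s - 1) - max(abs(w) for w in W) // 2
print(condition_sufficient, is_resilient, nonlinearity)
```

For this instance the function $x \cdot \phi(y)$ is 2-resilient with nonlinearity $2^5 - 2^3 = 24$, in agreement with the parameters $(s, s-k, k-1, 2^{s-1} - 2^{s-k+1})$.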

Remark 6.2. Another example of a pair of functions $(g_1, g_2)$ achieving
Siegenthaler's and Sarkar et al.'s bounds, such that $g_1 + g_2$ has the same degree
as $g_1$ and $g_2$ and with disjoint Walsh spectra can be found in the literature:
it is a pair of $(7, 1, 5, 52)$ functions coming from an $(8, 1, 6, 116)$ function in
[19]. This pair is not of the same kind as the others, because the resiliency order
1 is upper bounded by $\frac{s}{2} - 2$, where $s$ is the number of variables.







7. Generalizations of the Construction 



7.1. For Boolean Functions 

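The construction of Theorem 5.1 can be generalized into

$$h(x^1,\dots,x^k) = \prod_{i=1}^{k} (f_1^i + f_2^i)(x^i) + \sum_{i=1}^{k} f_1^i(x^i), \quad \text{where } x^i \in F_2^{n_i},\ \forall i = 1,\dots,k.$$

It is a simple matter to see that the Walsh transform of the function
$\sum_{i=1}^{k} f_1^i(x^i)$ is the function
$(a^1,\dots,a^k) \mapsto \prod_{i=1}^{k} \widehat{f_1^i}(a^i)$. Since the product
$\prod_{i=1}^{k} (f_1^i + f_2^i)(x^i)$ equals 1 if and only if
$(f_1^i + f_2^i)(x^i)$ equals 1 for every $i$, we deduce:

$$\widehat{h}(a^1,\dots,a^k) = \prod_{i=1}^{k} \widehat{f_1^i}(a^i) - 2 \prod_{i=1}^{k} \Big( \sum_{x^i \in F_2^{n_i} / f_1^i(x^i) + f_2^i(x^i) = 1} (-1)^{f_1^i(x^i) + a^i \cdot x^i} \Big)$$

$$= \prod_{i=1}^{k} \widehat{f_1^i}(a^i) - 2 \prod_{i=1}^{k} \Big( \sum_{x^i \in F_2^{n_i}} (-1)^{f_1^i(x^i) + a^i \cdot x^i}\, \frac{1 - (-1)^{(f_1^i + f_2^i)(x^i)}}{2} \Big)$$

$$= \prod_{i=1}^{k} \widehat{f_1^i}(a^i) - 2^{1-k} \prod_{i=1}^{k} \big( \widehat{f_1^i}(a^i) - \widehat{f_2^i}(a^i) \big)$$

$$= \prod_{i=1}^{k} \widehat{f_1^i}(a^i) - 2^{1-k} \sum_{I \subseteq \{1,\dots,k\}} (-1)^{|I|} \prod_{i \in I} \widehat{f_2^i}(a^i) \prod_{i \in \{1,\dots,k\} \setminus I} \widehat{f_1^i}(a^i).$$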
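The product rule for direct sums and the resulting expression of the Walsh transform of the generalized construction can be checked numerically. The sketch below does so for $k = 3$ blocks of 2 variables each, with random functions; the sizes, the seed and the packing of the blocks $x^1, x^2, x^3$ into one integer index are assumptions of the example. The identity is tested in the integer form $2^{k-1}\, \widehat{h} = 2^{k-1} \prod_i \widehat{f_1^i} - \prod_i (\widehat{f_1^i} - \widehat{f_2^i})$.

```python
import random
from math import prod

def walsh(tt):
    """Walsh transform of a truth table indexed by integers (bit i-1 <-> x_i)."""
    size = len(tt)
    return [sum((-1) ** (tt[x] ^ (bin(a & x).count("1") & 1)) for x in range(size))
            for a in range(size)]

random.seed(2)
k, n = 3, 2                                   # k blocks of n variables (assumption)
f1 = [[random.randint(0, 1) for _ in range(2 ** n)] for _ in range(k)]
f2 = [[random.randint(0, 1) for _ in range(2 ** n)] for _ in range(k)]

def block(z, i):
    """The i-th block x^i (or a^i) packed inside the integer index z."""
    return (z >> (n * i)) & (2 ** n - 1)

direct_sum, h = [], []
for z in range(2 ** (k * n)):
    total, product_term = 0, 1
    for i in range(k):
        total ^= f1[i][block(z, i)]
        product_term &= f1[i][block(z, i)] ^ f2[i][block(z, i)]
    direct_sum.append(total)
    h.append(total ^ product_term)

W1 = [walsh(t) for t in f1]
W2 = [walsh(t) for t in f2]
Wsum, Wh = walsh(direct_sum), walsh(h)

product_rule_holds = all(
    Wsum[a] == prod(W1[i][block(a, i)] for i in range(k))
    for a in range(2 ** (k * n)))

generalized_formula_holds = all(
    2 ** (k - 1) * Wh[a]
    == 2 ** (k - 1) * prod(W1[i][block(a, i)] for i in range(k))
    - prod(W1[i][block(a, i)] - W2[i][block(a, i)] for i in range(k))
    for a in range(2 ** (k * n)))

print(product_rule_holds, generalized_formula_holds)
```

For $k = 2$ the same identity reduces to Relation (5.2).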



7.2. For Vectorial Functions 

Our construction can be also generalized to vectorial functions. Let $F_1$ and
$F_2$ be mappings from $F_2^r$ to a field $F_{2^k}$, and let $G_1$ and $G_2$ be
mappings from $F_2^s$ to $F_{2^k}$. Define
$H(x,y) = F_1(x) + G_1(y) + (F_1 + F_2)(x) \times (G_1 + G_2)(y)$, where "$\times$"
is the multiplication in the field $F_{2^k}$. Recall that, in the case of a
vectorial function $H$ valued in such a field, the Walsh transform is applied to
the Boolean functions $tr(\alpha H)$, where $tr$ is the trace function from
$F_{2^k}$ to $F_2$, and where $\alpha$ is any element of $F_{2^k}$. The value at
$(a,b) \in F_2^r \times F_2^s$ of the Walsh transform of $tr(\alpha H)$ equals:

$$\sum_{\beta \in F_{2^k}}\ \sum_{y \in F_2^s / (G_1 + G_2)(y) = \beta}\ \sum_{x \in F_2^r} (-1)^{tr[\alpha F_1(x) + \alpha G_1(y) + \alpha \beta (F_1 + F_2)(x)] + a \cdot x + b \cdot y}$$

$$= 2^{-k} \sum_{\beta \in F_{2^k}} \sum_{\gamma \in F_{2^k}} \sum_{y \in F_2^s} \sum_{x \in F_2^r} (-1)^{tr[\alpha F_1(x) + \alpha G_1(y) + \alpha \beta (F_1 + F_2)(x) + \gamma ((G_1 + G_2)(y) + \beta)] + a \cdot x + b \cdot y}$$

$$= 2^{-k} \sum_{\beta \in F_{2^k}} \sum_{\gamma \in F_{2^k}} (-1)^{tr[\gamma \beta]} \Big( \sum_{y \in F_2^s} (-1)^{tr[\alpha G_1(y) + \gamma (G_1 + G_2)(y)] + b \cdot y} \Big) \Big( \sum_{x \in F_2^r} (-1)^{tr[\alpha F_1(x) + \alpha \beta (F_1 + F_2)(x)] + a \cdot x} \Big).$$







8. A Secondary Construction of Bent Functions Revisited 

We now consider the same secondary construction as in the previous sections, but applied to bent functions instead of resilient functions. Recall that any $n$-variable bent function $f$ ($n$ even) admits a dual $\tilde f$ defined by $\widehat{\chi_f}(u) = 2^{n/2} (-1)^{\tilde f(u)}$ for all $u \in \mathbb{F}_2^n$. Let $f_1$ and $f_2$ be two $r$-variable bent functions ($r$ even) and let $g_1$ and $g_2$ be two $s$-variable bent functions ($s$ even). Let us denote their duals by $\tilde f_1$, $\tilde f_2$, $\tilde g_1$ and $\tilde g_2$. Define again $h(x,y) = f_1(x) + g_1(y) + (f_1+f_2)(x)\,(g_1+g_2)(y)$, $x \in \mathbb{F}_2^r$, $y \in \mathbb{F}_2^s$. According to Relation (5.2), we have

\[
\widehat{\chi_h}(a,b) = 2^{\frac{r+s}{2}-1} \Big[ (-1)^{\tilde f_1(a)+\tilde g_1(b)} + (-1)^{\tilde f_1(a)+\tilde g_2(b)} + (-1)^{\tilde f_2(a)+\tilde g_1(b)} - (-1)^{\tilde f_2(a)+\tilde g_2(b)} \Big]
\]
\[
= 2^{\frac{r+s}{2}} (-1)^{\tilde f_1(a) + \tilde g_1(b) + (\tilde f_1+\tilde f_2)(a)\,(\tilde g_1+\tilde g_2)(b)}.
\]

This last equality can easily be checked in each of the 16 cases corresponding to the 16 possible values of $(\tilde f_1(a), \tilde f_2(a), \tilde g_1(b), \tilde g_2(b))$. We see that $h$ is then bent and that its dual $\tilde h$ can be obtained from $\tilde f_1$, $\tilde f_2$, $\tilde g_1$ and $\tilde g_2$ in exactly the same manner as $h$ is obtained from $f_1$, $f_2$, $g_1$ and $g_2$. Note that this construction generalizes the classical construction $f(x) + g(y)$ indicated by J. Dillon. It also has the interest of generating functions which are not "decomposable". Moreover, if $f_1 + f_2$ has maximum possible degree $\frac{r}{2}$ and if $g_1 + g_2$ has maximum possible degree $\frac{s}{2}$, then $h$ has maximum possible degree $\frac{r+s}{2}$.
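The 16-case check and the bentness claim can be tried numerically on a toy instance. The following is a minimal sketch, not part of the original text: the helper names (`walsh`, `is_bent`, `identity_holds`) and the choice $r = s = 2$ with the four quadratic functions below are assumptions made only for illustration.

```python
from itertools import product

def walsh(f, n):
    """Walsh transform of an n-variable Boolean function stored as a dict on {0,1}^n."""
    pts = list(product((0, 1), repeat=n))
    return {a: sum((-1) ** (f[x] ^ (sum(ai & xi for ai, xi in zip(a, x)) & 1))
                   for x in pts)
            for a in pts}

def is_bent(f, n):
    """f is bent iff every Walsh coefficient has absolute value 2^(n/2)."""
    return all(abs(w) == 2 ** (n // 2) for w in walsh(f, n).values())

def identity_holds():
    """Check the sign identity used above in all 16 cases of the four dual values."""
    for F1, F2, G1, G2 in product((0, 1), repeat=4):
        lhs = ((-1) ** (F1 ^ G1) + (-1) ** (F1 ^ G2)
               + (-1) ** (F2 ^ G1) - (-1) ** (F2 ^ G2))
        rhs = 2 * (-1) ** (F1 ^ G1 ^ ((F1 ^ F2) & (G1 ^ G2)))
        if lhs != rhs:
            return False
    return True

# Illustrative 2-variable bent functions (assumed examples, r = s = 2).
pts2 = list(product((0, 1), repeat=2))
f1 = {x: x[0] & x[1] for x in pts2}
f2 = {x: (x[0] & x[1]) ^ x[0] for x in pts2}
g1 = {y: y[0] & y[1] for y in pts2}
g2 = {y: (y[0] & y[1]) ^ y[1] ^ 1 for y in pts2}

# h(x, y) = f1(x) + g1(y) + (f1 + f2)(x) (g1 + g2)(y) over F_2.
h = {x + y: f1[x] ^ g1[y] ^ ((f1[x] ^ f2[x]) & (g1[y] ^ g2[y]))
     for x in pts2 for y in pts2}

print(identity_holds(), is_bent(h, 4))
```

The brute-force Walsh transform is exponential in the number of variables, so this only serves as a sanity check on very small cases.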

This secondary construction leads to new bent functions. For instance, if we take $f_1$, $f_2$, $g_1$ and $g_2$ in Maiorana-McFarland's class, we see that for all permutations $\pi_1$ and $\pi_2$ on $\mathbb{F}_2^{r/2}$, for all permutations $\pi'_1$ and $\pi'_2$ on $\mathbb{F}_2^{s/2}$, for all $r/2$-variable Boolean functions $h_1$ and $h_2$ and for all $s/2$-variable Boolean functions $h'_1$ and $h'_2$, the function

\[
(x,y,x',y') \in \mathbb{F}_2^{r/2} \times \mathbb{F}_2^{r/2} \times \mathbb{F}_2^{s/2} \times \mathbb{F}_2^{s/2} \mapsto x \cdot \pi_1(y) + h_1(y) + x' \cdot \pi'_1(y') + h'_1(y') + \big(x \cdot \pi_1(y) + h_1(y) + x \cdot \pi_2(y) + h_2(y)\big)\big(x' \cdot \pi'_1(y') + h'_1(y') + x' \cdot \pi'_2(y') + h'_2(y')\big)
\]

is bent. The function $x \cdot \pi_1(y) + h_1(y) + x' \cdot \pi'_1(y') + h'_1(y')$ belongs to Maiorana-McFarland's class, but the global function above does not, in general.

Remark 8.1. Thus, if $f_1$ and $f_2$ satisfy $PC(r)$ and if $g_1$ and $g_2$ satisfy $PC(s)$, then $h$ satisfies $PC(r+s)$. Note that if $f_1$ and $f_2$ satisfy only $PC(\ell)$ with $\ell < r$, or if $g_1$ and $g_2$ satisfy only $PC(\ell')$ with $\ell' < s$, then we lose most of the strength of this result: take for instance $f_1 = f_2$, then $h(x,y) = f_1(x) + g_1(y)$; the derivative $D_{(a,b)}h(x,y) = D_a f_1(x) + D_b g_1(y)$ is balanced if and only if $D_a f_1$ is balanced or $D_b g_1$ is balanced; thus, if $f_1$ and $f_2$ satisfy $PC(\ell)$ and if $g_1$ and $g_2$ satisfy $PC(\ell')$, then $h$ may satisfy only $PC(\min(\ell,\ell'))$.

Remark 8.2. The construction $(f_1, f_2, g_1, g_2) \mapsto h$ is a particular case of the general secondary construction given in [3], which we now describe: let $h$ be a Boolean function on $\mathbb{F}_2^{r+s} = \mathbb{F}_2^r \times \mathbb{F}_2^s$ such that, for any element $y$ of $\mathbb{F}_2^s$, the function on $\mathbb{F}_2^r$:

\[ h_y : x \mapsto h(x,y) \]

is bent. Then $h$ is bent if and only if, for any element $u$ of $\mathbb{F}_2^r$, the function

\[ \varphi_u : y \mapsto \tilde h_y(u) \]

is bent on $\mathbb{F}_2^s$. If this condition is satisfied, then the dual of $h$ is the function $\tilde h(u,v) = \tilde\varphi_u(v)$.

Here, for every $y$, $h_y$ equals $f_1$ plus a constant or $f_2$ plus a constant (depending on the value of $y$) and thus is bent; and $\varphi_u$ equals $g_1$ plus a constant or $g_2$ plus a constant (depending on the value of $u$), and thus is bent too.

What is interesting in the particular case of the construction $(f_1, f_2, g_1, g_2) \mapsto h$ is that it only assumes the bentness of $f_1$, $f_2$, $g_1$ and $g_2$ to deduce the bentness of $h$; no extra condition is needed, contrary to the general construction recalled above.



9. Conclusion 

We have given a general framework for the best known secondary construction of
resilient functions achieving Siegenthaler's and Sarkar et al.'s bounds, and avoiding
nonzero linear structures. This has led us to a generalization of this construction,
leading to an infinite multiple branching tree (instead of an infinite sequence)
of such functions. The original secondary construction made it possible to build a few
functions achieving Siegenthaler's and Sarkar et al.'s bounds for any number of
variables n and any resiliency order t such that t > and t > f — 2. Our 

generalization makes it possible to build many more such functions with the same number
of variables and resiliency order. It also makes it possible to design vectorial functions and
bent functions.







References 

[1] P. Camion, C. Carlet, P. Charpin, N. Sendrier. On correlation-immune functions. Advances in Cryptology: Crypto '91, Proceedings, Lecture Notes in Computer Science, V. 576, pp. 86-100, 1991.

[2] A. Canteaut and M. Trabbia. Improved fast correlation attacks using parity-check equations of weight 4 and 5. Advances in Cryptology - EUROCRYPT 2000, Lecture Notes in Computer Science 1807, pp. 573-588, 2000.

[3] C. Carlet. A construction of bent functions. Finite Fields and Applications, London Mathematical Society Lecture Series 233, Cambridge University Press, pp. 47-58, 1996.

[4] C. Carlet. More correlation-immune and resilient functions over Galois fields and Galois rings. Advances in Cryptology, EUROCRYPT '97, Lecture Notes in Computer Science 1233, Springer Verlag, pp. 422-433, 1997.

[5] C. Carlet. On the coset weight divisibility and nonlinearity of resilient and correlation-immune functions. Proceedings of SETA '01 (Sequences and their Applications 2001), Discrete Mathematics and Theoretical Computer Science, Springer, pp. 131-144, 2001.

[6] C. Carlet. A larger class of cryptographic Boolean functions via a study of the Maiorana-McFarland construction. Advances in Cryptology - CRYPTO 2002, Lecture Notes in Computer Science 2442, pp. 549-564, 2002.

[7] C. Carlet and E. Prouff. On plateaued functions and their constructions. Proceedings of Fast Software Encryption 2003, Advances in Cryptology, Lecture Notes in Computer Science 2887, pp. 54-73, Springer, 2003.

[8] C. Carlet and P. Sarkar. Spectral domain analysis of correlation immune and resilient Boolean functions. Finite Fields and Applications 8, pp. 120-130, 2002.

[9] F. Chabaud and S. Vaudenay. Links between differential and linear cryptanalysis. EUROCRYPT '94, Advances in Cryptology, Lecture Notes in Computer Science 950, Springer Verlag, pp. 356-365, 1995.

[10] S. Chee, S. Lee, K. Kim and D. Kim. Correlation immune functions with controllable nonlinearity. ETRI Journal, vol. 19, no. 4, pp. 389-401, 1997.

[11] S. Chee, S. Lee, D. Lee and S.H. Sung. On the correlation immune functions and their nonlinearity. Proceedings of Asiacrypt '96, LNCS 1163, pp. 232-243.

[12] J.F. Dillon. Elementary Hadamard Difference sets. Ph.D. Thesis, Univ. of Maryland, 1974.

[13] J.H. Evertse. Linear structures in block ciphers. In Advances in Cryptology - EUROCRYPT '87, no. 304 in Lecture Notes in Computer Science, Springer Verlag, pp. 249-266, 1988.

[14] T. Jakobsen and L.R. Knudsen. The interpolation attack on block ciphers. Fast Software Encryption '97, Lecture Notes in Computer Science 1267, pp. 28-40, 1997.

[15] L.R. Knudsen. Truncated and higher order differentials. Fast Software Encryption, Second International Workshop, Lecture Notes in Computer Science, no. 1008, pp. 196-211, Springer Verlag, 1995.







[16] X. Lai. Higher order derivatives and differential cryptanalysis. Proc. Symposium on Communication, Coding and Cryptography, in honor of J.L. Massey on the occasion of his 60th birthday. R. Blahut, editor. Kluwer Academic Publishers, 1994.

[17] S. Leveiller, G. Zemor, P. Guillot and J. Boutros. A new cryptanalytic attack for PN-generators filtered by a Boolean function. Proceedings of Selected Areas in Cryptography 2002, LNCS 2595, pp. 232-249, 2003.

[18] F.J. MacWilliams and N.J. Sloane. The Theory of Error-Correcting Codes, Amsterdam, North Holland, 1977.

[19] S. Maitra and E. Pasalic. Further constructions of resilient Boolean functions with very high nonlinearity. IEEE Transactions on Information Theory, vol. 48 (7), pp. 1825-1834, 2002.

[20] S. Maitra and P. Sarkar. Modifications of Patterson-Wiedemann functions for cryptographic applications. IEEE Trans. Inform. Theory, Vol. 48, pp. 278-284, 2002.

[21] M. Matsui. Linear cryptanalysis method for DES cipher. Advances in Cryptology - EUROCRYPT '93, number 765 in Lecture Notes in Computer Science, Springer Verlag, pp. 386-397, 1994.

[22] N.J. Patterson and D.H. Wiedemann. The covering radius of the [2^15, 16] Reed-Muller code is at least 16276. IEEE Trans. Inform. Theory, IT-29, pp. 354-356, 1983.

[23] N.J. Patterson and D.H. Wiedemann. Correction to [22]. IEEE Trans. Inform. Theory, IT-36(2), p. 443, 1990.

[24] E. Pasalic, S. Maitra, T. Johansson and P. Sarkar. New constructions of resilient functions and correlation immune Boolean functions achieving upper bound on nonlinearity. Proceedings of the Workshop on Coding and Cryptography 2001, pp. 425-434, 2001.

[25] O.S. Rothaus. On bent functions. J. Comb. Theory, 20A, pp. 300-305, 1976.

[26] R.A. Rueppel. Analysis and Design of Stream Ciphers, Com. and Contr. Eng. Series, Springer, Berlin, 1986.

[27] P. Sarkar and S. Maitra. Construction of nonlinear Boolean functions with important cryptographic properties. Advances in Cryptology - EUROCRYPT 2000, number 1807 in Lecture Notes in Computer Science, pp. 485-506, Springer Verlag, 2000.

[28] P. Sarkar and S. Maitra. Nonlinearity bounds and constructions of resilient Boolean functions. CRYPTO 2000, LNCS Vol. 1880, ed. Mihir Bellare, pp. 515-532, 2000.

[29] T. Siegenthaler. Correlation-immunity of nonlinear combining functions for cryptographic applications. IEEE Transactions on Information Theory, V. IT-30, No. 5, pp. 776-780, 1984.

[30] T. Siegenthaler. Decrypting a class of stream ciphers using ciphertext only. IEEE Transactions on Computer, V. C-34, No. 1, pp. 81-85, 1985.

[31] Y.V. Tarannikov. On resilient Boolean functions with maximum possible nonlinearity. Proceedings of INDOCRYPT 2000, Lecture Notes in Computer Science 1977, pp. 19-30, 2000.

[32] Y.V. Tarannikov. New constructions of resilient Boolean functions with maximum nonlinearity. Proceedings of FSE 2001, 8th International Workshop, FSE 2001, Lecture Notes in Computer Science, vol. 2355, pp. 66-77, 2001.







[33] G.-Z. Xiao and J.L. Massey. A spectral characterization of correlation-immune combining functions. IEEE Trans. Inf. Theory, Vol. IT-34, no. 3, pp. 569-571, 1988.

[34] G.-Z. Xiao, C. Ding and W. Shan. The Stability Theory of Stream Ciphers, vol. LNCS 561, Springer Verlag, 1991.

[35] Y. Zheng, X.-M. Zhang. Improving upper bound on the nonlinearity of high order correlation immune functions. Proceedings of Selected Areas in Cryptography 2000, Lecture Notes in Computer Science 2012, pp. 262-274, 2001.



Claude Carlet 
INRIA 

Projet CODES 
BP 105 

F-78153 Le Chesnay Cedex, France 
also with 

GREYC (Caen) and the University of Paris 8 
e-mail: claude.carlet@inria.fr




Progress in Computer Science and Applied Logic, Vol. 23, 29-50 
© 2004 Birkhäuser Verlag Basel/Switzerland



Adaptive Recursive MLD Algorithm Based on 
Parallel Concatenation Decomposition for 
Binary Linear Codes 

Tadao Kasami 



Abstract. Based on the original recursive MLD algorithm (RMLD), “top- 
down RMLD” has been proposed to reduce the average decoding complexity 
by a lazy evaluation strategy. In this paper, a revised version of top-down 
RMLD, called adaptive RMLD, is surveyed. In the adaptive RMLD, the coars- 
est parallel concatenation decomposition is adopted as the basis of recursion, 
and a new sufficient condition that a currently best candidate is the optimum 
at the current level of recursion is used as an early termination condition of 
the recursion. Preliminary simulation results for the (128, 64) Reed-Muller 
code are presented. 

Keywords. MLD, Recursive, Adaptive, Parallel concatenation decomposition. 
This paper is partially based on [11]. 



1. Introduction 

Two types of maximum likelihood decoding algorithms for linear block codes have been proposed. The decoding complexity of the first type (for example, Viterbi type algorithms) is almost independent of the signal to noise ratio, and the complexity of the second type (for example, iterative decoding algorithms [1, 2]) decreases considerably as the signal to noise ratio increases. Based on the original recursive maximum likelihood decoding algorithm [3] of the first type, abbreviated as "Bottom-up RMLD", a recursive maximum likelihood decoding algorithm of the second type, called "Top-down RMLD" [4], has been introduced by means of a lazy evaluation strategy. In contrast with Viterbi type algorithms, RMLD has a parallel structure, and bottom-up RMLD provides a reduced decoding complexity compared with the optimally sectionalized Viterbi algorithm of A. Lafourcade and A. Vardy [5]. Furthermore, RMLD has a homogeneous structure for binary-transitive-invariant codes [6], e.g., RM codes and EBCH (extended primitive BCH) codes.







In this paper, a revised version of top-down RMLD, called "Adaptive RMLD", is introduced. In Adaptive RMLD, the coarsest parallel concatenation decomposition is adopted as the basis of recursion, and a new sufficient condition that a currently best candidate is the optimum at the current level of recursion is used as an early termination condition of the recursion. Preliminary simulation results (Figures 6 and 7) [12] on the (128, 64, 16) RM code show that the numbers of addition equivalent operations (AEO) are reduced to about $10^{-2}$ for 1.0 dB to 3.0 dB and about $10^{-1}$ for 4.0 dB of those required by top-down RMLD [7, 8], which used ordered statistic information [9].

2. Notations 

For $i \le j$, $[i,j]$ denotes the set of integers from $i$ to $j$, called a section. For a positive integer $n$, $V^n$ denotes the set of binary $n$-tuples. For $u = (u_1, u_2, \ldots, u_n) \in V^n$ and a subset $I = \{i_1, i_2, \ldots, i_m\}$ of $[1,n]$, $p_I u = (u_{i_1}, u_{i_2}, \ldots, u_{i_m})$. For two subsets $I$ and $J$ of $[1,n]$, $p_J(p_I u) = p_{I \cap J} u$. For $U \subseteq V^n$, $p_I U = \{p_I u : u \in U\}$ and $s_I U = p_I\{u \in U : \mathrm{sup}(u) \subseteq I\}$, where $\mathrm{sup}(u)$ denotes the support of $u$. For $J = \{j_1, j_2, \ldots, j_g\} \subseteq I$, $u \in p_I V^n$ and $U \subseteq p_I V^n$, define $p_J u = (u_{j_1}, u_{j_2}, \ldots, u_{j_g})$ and $p_J U = \{p_J u : u \in U\}$. For a matrix $M$ with $n$ columns, $p_I M$ denotes the submatrix of $M$ consisting of the $i_1$-th, the $i_2$-th, ..., the $i_m$-th columns in this order.

We assume that a binary $(N, K)$ linear block code $C$ is used over an AWGN channel with BPSK signaling and that all codewords are equally likely to be transmitted. For a received sequence $r = (r_1, r_2, \ldots, r_N)$ (see footnote 1), let $z = (z_1, z_2, \ldots, z_N)$ denote the binary hard-decision sequence for $r$. For $I \subseteq [1,N]$ and $u \in p_I V^N$, define

\[ L_I(u) = \sum_{\{i \in I \,:\, u_i \ne z_i\}} |r_i|. \tag{2.1} \]

$L_I(u)$ (or simply $L(u)$) is called the correlation discrepancy of $u$. For a nonempty subset $U$ of $p_I V^N$, define $L[U] = \min_{u \in U} L(u)$, and for $u \in U$ such that $L(u) = L[U]$, we write $u = v[U]$ and call it the best (see footnote 2), or the most likely, in $U$. Define $L[\emptyset] = \infty$ for the empty set $\emptyset$. For the most likely codeword $c_{ML}$ of $C$, $c_{ML} = v[C]$. For a family $T$ of subsets of $p_I V^N$ and $1 \le i \le |T|$, let $v_T(i)$ denote the $i$-th best in $\{v[D] : D \in T\}$.
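As an illustration of (2.1), the following minimal sketch computes the correlation discrepancy and finds the most likely codeword by exhaustive search. It is not taken from the paper: the function names, the 0-based positions, the BPSK mapping (bit 0 to +1, bit 1 to -1), the generator matrix of the (8, 4) first-order RM code and the received values are all assumptions chosen for the example.

```python
from itertools import product

def span(gen):
    """All codewords generated by the rows of gen (tuples of 0/1)."""
    n = len(gen[0])
    return {tuple(sum(c & row[i] for c, row in zip(coeffs, gen)) % 2 for i in range(n))
            for coeffs in product((0, 1), repeat=len(gen))}

def hard_decision(r):
    """Binary hard decisions z, assuming bit 0 is sent as +1 and bit 1 as -1."""
    return tuple(0 if ri >= 0 else 1 for ri in r)

def discrepancy(u, r, positions=None):
    """Correlation discrepancy (2.1): sum of |r_i| over positions where u_i != z_i.
    u is a full-length word; positions (0-based) restricts the sum to a section I."""
    z = hard_decision(r)
    idx = range(len(r)) if positions is None else positions
    return sum(abs(r[i]) for i in idx if u[i] != z[i])

def correlation(u, r):
    """Correlation between r and the BPSK image of u."""
    return sum(ri * (1 - 2 * ui) for ri, ui in zip(r, u))

# Assumed example: generator matrix of the (8, 4) first-order RM code.
G = [(1, 1, 1, 1, 1, 1, 1, 1),
     (0, 1, 0, 1, 0, 1, 0, 1),
     (0, 0, 1, 1, 0, 0, 1, 1),
     (0, 0, 0, 0, 1, 1, 1, 1)]
code = span(G)
r = [0.875, -0.25, 1.0, 0.5, -0.75, 0.375, 1.25, -0.125]

c_ml = min(code, key=lambda c: discrepancy(c, r))   # c_ML = v[C]
print(c_ml, discrepancy(c_ml, r))
```

Minimizing the discrepancy is equivalent to maximizing the correlation, since `correlation(u, r)` equals the sum of all $|r_i|$ minus twice `discrepancy(u, r)`.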

For a binary linear block code $A_1$ and its linear subcode $A_2$, let $A_1/A_2$ denote the set of cosets of $A_2$ in $A_1$. For a linear subcode $A_3$ of $A_2$, a coset $B \in A_1/A_2$ can be represented as a union of $|A_2/A_3|$ cosets in $A_1/A_3$. Define $B/A_3 = \{D \in A_1/A_3 : D \subseteq B\}$. In this paper, a coset $D$ in a set $T$ of cosets is denoted by its id-number. For $D \in T$, a unique binary sequence, denoted $id_T(D)$, is assigned. Let $Id_T$ denote $\{id_T(D) : D \in T\}$. For $u \in D$, define $id_T(u) = id_T(D)$. For $u$ in a coset in $T$, let $T(u)$ or $T(id_T(u))$ denote the coset which contains $u$. For $T = A_1/A_2$, we choose a linear code, denoted $[T]$, with dimension $m = \log_2 |T|$ as the set of coset leaders of $A_1/A_2$. Let $\{g_1, g_2, \ldots, g_m\}$ be a chosen basis of $[T]$. For a coset $D \in T$ whose coset leader is $\sum_{i=1}^{m} a_i g_i$ with $a_i \in \{0,1\}$,

\[ id_T(D) = (a_1, a_2, \ldots, a_m). \tag{2.2} \]

Footnote 1: For simplicity, the unquantized case will be considered.

Footnote 2: For $u \ne u'$ in $U$, the probability that $L(u) = L(u')$ is zero.

For $id \in Id_{A_1/A_2}$, let $\mu_{A_1/A_2}(id)$ or simply $\mu(id)$ denote the coset leader of the coset whose id-number is $id$. That is,

\[ (A_1/A_2)(id) = \mu_{A_1/A_2}(id) + A_2. \tag{2.3} \]

In case that $T = D/A_3$ with $D \in A_1/A_2$ and $id_1 = id_{A_1/A_2}(D)$, for $E \in D/A_3$, there is a unique $id_2 \in Id_{A_2/A_3}$ such that

\[ E = \mu_{A_1/A_2}(id_1) + \mu_{A_2/A_3}(id_2) + A_3. \tag{2.4} \]

Consequently, we define

\[ id_{D/A_3}(E) = id_1 \circ id_2, \tag{2.5} \]
\[ \mu_{D/A_3}(id_1 \circ id_2) = \mu_{A_1/A_2}(id_1) + \mu_{A_2/A_3}(id_2), \tag{2.6} \]

where infix "$\circ$" denotes the concatenation operation. We use the following notations. For $\alpha = \beta \circ \gamma \in \{0,1\}^*$, $\beta \backslash \alpha = \gamma$ and $\alpha / \gamma = \beta$. For $\beta, \gamma \in \{0,1\}^*$ and $A \subseteq \{0,1\}^*$, $\beta \backslash A = \{\beta \backslash \alpha : \alpha \in A\}$ and $A / \gamma = \{\alpha / \gamma : \alpha \in A\}$. From the definition of $Id_T$ and (2.3) to (2.6),

\[ Id_{A_2/A_3} = id_1 \backslash Id_{D/A_3}, \tag{2.7} \]
\[ (D/A_3)(id) = \mu_{D/A_3}(id) + A_3, \quad \text{for } id \in Id_{D/A_3}. \tag{2.8} \]

3. Parallel Concatenation Decomposition of Coset Sets 

In the original, top-down or adaptive RMLD, the division into sub-problems is specified by a binary sectionalization where each section is labelled $I_\alpha$ with index $\alpha \in \{0,1\}^*$. The length of $\alpha$, denoted $|\alpha|$, is called the level of section $I_\alpha$. $I_\lambda$ denotes $[1,N]$, where $\lambda$ is the empty sequence. A section $I_\alpha$ with $|I_\alpha| \ge 2$, called a nonleaf section, is partitioned into $I_{\alpha 0}$ and $I_{\alpha 1}$. A binary sectionalization can be represented by a binary tree like Fig. 1. For code length $N = 2^m$, a sectionalization such that $|I_\alpha| = 2^{m-|\alpha|}$ with $0 \le |\alpha| \le m$ is called the uniform binary sectionalization. We abbreviate the suffix $I_\alpha$ as $\alpha$, e.g., $p_{I_\alpha}$ as $p_\alpha$ and $s_{I_\alpha}$ as $s_\alpha$. For $U \subseteq V^{I_\alpha}$ and $\beta \in \{0,1\}^*$, $p_{\alpha\beta} U$ and $s_{\alpha\beta} U$ are abbreviated as $p_\beta U$ and $s_\beta U$, respectively.




Figure 1: The uniform binary section tree with N = 2^3.

RMLD is based on decomposition techniques. For a nonleaf section $I_\alpha$, let $A$ and $B$ be a binary linear code and its linear subcode over $I_\alpha$, respectively. Since $s_0 B \circ s_1 B$ is a linear subcode of $A$, we have the following.

Parallel Concatenation Decomposition: $A$ can be decomposed as

\[ A = \bigcup_{D \in A/(s_0 B \circ s_1 B)} p_0 D \circ p_1 D. \tag{3.1} \]

△

For $u$ and $u'$ in $D \in A/(s_0 B \circ s_1 B)$, $p_b u + p_b u' \in s_b B$ with $b \in \{0,1\}$, that is,

\[ p_b D \in p_b A / s_b B, \tag{3.2} \]

and if $s_b A = s_b B$ for $b \in \{0,1\}$, then there is a one-to-one correspondence between $p_0 A / s_0 B$ and $p_1 A / s_1 B$. For $D \in A/(s_0 B \circ s_1 B)$, $p_b D \in p_b A / s_b B$ and the coset $p_{\bar b} D$ in $p_{\bar b} A / s_{\bar b} B$, where $\bar 0 = 1$ and $\bar 1 = 0$, the pair $(p_b D, p_{\bar b} D)$ is called an adjacent pair, and $p_{\bar b} D$ is called the counter part of $p_b D$.

Concatenation Lemma: Let $E_b$ be a linear code over $I_{\alpha b}$ with $b \in \{0,1\}$. Then,

\[ E_0 \circ E_1 \subseteq B \iff E_b \subseteq s_b B \ \text{ for } b \in \{0,1\}. \tag{3.3} \]

(3.3) follows from $0 \in E_b$ for $b \in \{0,1\}$.

△




From the above lemma, the coarsest decomposition of type (3.1) is obtained
by choosing B as A itself.

Coarsest Parallel Concatenation Decomposition:

\[ A = \bigcup_{D \in A/(s_0 A \circ s_1 A)} p_0 D \circ p_1 D. \tag{3.4} \]

Any decomposition of type (3.1) is a refinement of the decomposition (3.4). Each coset in $A/(s_0 A \circ s_1 A)$ consists of $|s_0 A / s_0 B| \cdot |s_1 A / s_1 B|$ cosets of $A/(s_0 B \circ s_1 B)$. We consider $A/(s_0 A \circ s_1 A)$ in (3.4). We can choose a generator matrix $G$ of $A$ of the following form:

\[ G = \begin{bmatrix} G_0 & 0 \\ 0 & G_1 \\ \multicolumn{2}{c}{G_{0,1}} \end{bmatrix}, \tag{3.5} \]

where $0$ denotes a zero matrix, $G_0$ and $G_1$ are generator matrices of $s_0 A$ and $s_1 A$, respectively, and $G_{0,1}$ is a generator matrix of the set of coset leaders of $A/(s_0 A \circ s_1 A)$. For a basis $\mathcal{B}_\alpha = \{g_1, g_2, \ldots, g_m\}$ of $[A/(s_0 A \circ s_1 A)]$, $p_b \mathcal{B}_\alpha$ with $b \in \{0,1\}$ is linearly independent. Otherwise, there is a linear sum $u$ of rows of $G_{0,1}$ such that $p_0 u$ (or $p_1 u$) $= 0$ and $p_1 u$ (or $p_0 u$) $\ne 0$. Then, $p_1 u$ (or $p_0 u$) is in $s_1 A$ (or $s_0 A$), a contradiction.

We choose $p_b \mathcal{B}_\alpha$ as a basis of $[p_b A / s_b A]$. Then, for $D \in A/(s_0 A \circ s_1 A)$ and $\sum_{i=1}^{m} a_i g_i \in D$, it follows from (2.2) that for $b \in \{0,1\}$,

\[ id_{A/(s_0 A \circ s_1 A)}(D) = id_{p_b A / s_b A}(p_b D) = (a_1, a_2, \ldots, a_m), \tag{3.6} \]

and

\[ ((p_0 A / s_0 A)(id), (p_1 A / s_1 A)(id)) \text{ is an adjacent pair, and they are unique counter parts to each other.} \tag{3.7} \]

This simplifies the specification of the adjacent coset. Eq. (3.4) can be rewritten as

\[ A = \bigcup_{id \in Id_{A/(s_0 A \circ s_1 A)}} (p_0 A / s_0 A)(id) \circ (p_1 A / s_1 A)(id), \tag{3.8} \]

where $Id_{A/(s_0 A \circ s_1 A)} = Id_{p_b A / s_b A}$ with $b \in \{0,1\}$.
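The coarsest decomposition (3.4) and the uniqueness of counter parts (3.7) can be checked by brute force on a small code. The sketch below is an illustration only, under stated assumptions: the helper names (`span`, `p`, `s`, `cosets`), the 0-based positions and the choice of the (8, 4) first-order RM code split into two halves are not from the paper.

```python
from itertools import product

def span(gen):
    """All codewords generated by the rows of gen (tuples of 0/1)."""
    n = len(gen[0])
    return {tuple(sum(c & row[i] for c, row in zip(coeffs, gen)) % 2 for i in range(n))
            for coeffs in product((0, 1), repeat=len(gen))}

def xor(u, v):
    return tuple(a ^ b for a, b in zip(u, v))

def p(code, idx):
    """p_I: puncture every word to the positions in idx."""
    return {tuple(c[i] for i in idx) for c in code}

def s(code, idx):
    """s_I: words whose support lies inside idx, punctured to idx."""
    inside = {c for c in code if all(c[i] == 0 for i in range(len(c)) if i not in idx)}
    return p(inside, idx)

def cosets(code, sub):
    """Partition a linear code into cosets of its linear subcode sub."""
    left, out = set(code), []
    while left:
        D = frozenset(xor(next(iter(left)), w) for w in sub)
        out.append(D)
        left -= D
    return out

# Assumed example: A is the (8, 4) first-order RM code, I_0 and I_1 its two halves.
G = [(1, 1, 1, 1, 1, 1, 1, 1),
     (0, 1, 0, 1, 0, 1, 0, 1),
     (0, 0, 1, 1, 0, 0, 1, 1),
     (0, 0, 0, 0, 1, 1, 1, 1)]
A = span(G)
I0, I1 = (0, 1, 2, 3), (4, 5, 6, 7)

s0, s1 = s(A, I0), s(A, I1)
sub = {u + v for u in s0 for v in s1}     # s_0 A o s_1 A
blocks = cosets(A, sub)                    # A / (s_0 A o s_1 A)

# (3.4): every block D equals the concatenation p_0 D o p_1 D.
decomposition_ok = all(D == {u + v for u in p(D, I0) for v in p(D, I1)} for D in blocks)

# (3.7): D -> p_0 D and D -> p_1 D are injective, so counter parts are unique.
unique_counterparts = (len({frozenset(p(D, I0)) for D in blocks}) == len(blocks)
                       and len({frozenset(p(D, I1)) for D in blocks}) == len(blocks))

print(len(blocks), decomposition_ok, unique_counterparts)
```

Exhaustive enumeration of codewords is only feasible for very small dimensions; it is used here purely to make the set identities concrete.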

Let $A'$ be a linear supercode of $A$. For a coset $D \in A'/A$, the following is a corollary of (3.8). Define $id_D = id_{A'/A}(D)$, $id_{p_0 D} = id_{p_0 A'/p_0 A}(p_0 D)$ and $id_{p_1 D} = id_{p_1 A'/p_1 A}(p_1 D)$. Since $(p_0 A / s_0 A)(id) = \mu_{p_0 A / s_0 A}(id) + s_0 A$, $(p_1 A / s_1 A)(id) = \mu_{p_1 A / s_1 A}(id) + s_1 A$ and $D = \mu_{A'/A}(id_D) + A$, it follows from (3.8) that

\[ D = \bigcup_{id \in Id_{p_0 A / s_0 A} = Id_{p_1 A / s_1 A}} \big(p_0 \mu_{A'/A}(id_D) + \mu_{p_0 A / s_0 A}(id) + s_0 A\big) \circ \big(p_1 \mu_{A'/A}(id_D) + \mu_{p_1 A / s_1 A}(id) + s_1 A\big). \]

Since $p_0 D = p_0 \mu_{A'/A}(id_D) + p_0 A = \mu_{p_0 A'/p_0 A}(id_{p_0 D}) + p_0 A$, $p_0 \mu_{A'/A}(id_D) = \mu_{p_0 A'/p_0 A}(id_{p_0 D})$. From (2.4), $\mu_{p_0 A'/p_0 A}(id_{p_0 D}) + \mu_{p_0 A / s_0 A}(id) + s_0 A = (p_0 D / s_0 A)(id_{p_0 D} \circ id)$. Consequently, we have (3.9):

\[ D = \bigcup_{id \in Id_{A/(s_0 A \circ s_1 A)} = Id_{p_b A / s_b A}} (p_0 D / s_0 A)(id_{p_0 D} \circ id) \circ (p_1 D / s_1 A)(id_{p_1 D} \circ id), \tag{3.9} \]

where $((p_0 D / s_0 A)(id_{p_0 D} \circ id), (p_1 D / s_1 A)(id_{p_1 D} \circ id))$ is called an adjacent pair in $D$.



4. Recursive Maximum Likelihood Decoding 

We briefly review recursive maximum likelihood decoding (RMLD) [3] and introduce new versions of RMLD, called top-down RMLD [4, 7, 8] and its revised version [11, 12], called adaptive RMLD, based on a "call by need" approach. For an index $\alpha \in \{0,1\}^*$, define $T_\alpha = p_\alpha C / s_\alpha C$.

Local Optimum [3]: For a coset $D$ in $T_\alpha$, there is a codeword $u \in C$ such that $p_\alpha u = v[D]$. Then, for any codeword $u' \in C$ such that $p_\alpha u' \in D$,

\[ L_\alpha(u) \le L_\alpha(u'). \tag{4.1} \]

△

The sub-codeword $v[D]$ is called the most likely local (MLL) sub-codeword in $D$ or an MLL sub-codeword in $T_\alpha$. Since $p_\lambda C / s_\lambda C = \{C\}$, an MLL sub-codeword in $T_\lambda$ is the most likely codeword $c_{ML}$.

Let $I_\alpha$ be a nonleaf section. From the above lemma, the recursive search procedure for $c_{ML}$ in the original RMLD is based on the following decomposition, called MLL decomposition, which can be derived from (3.1) and (3.2) by substituting $p_\alpha C$ for $A$ and $s_\alpha C$ for $B$:

\[ p_\alpha C = \bigcup_{D \in PT_\alpha} p_0 D \circ p_1 D, \tag{4.2} \]

where

\[ PT_\alpha \triangleq p_\alpha C / (s_{\alpha 0} C \circ s_{\alpha 1} C). \tag{4.3} \]

This type of decomposition is the same as the one which holds for the sets of label sequences between adjacent subsections of a trellis diagram [10].

Consider $v_{T_\alpha}(i)$ with $1 \le i \le |T_\alpha|$, abbreviated as $v_\alpha(i)$, which is the $i$-th best MLL sub-codeword in $T_\alpha$, that is, the one with the smallest discrepancy in $T_\alpha \setminus \bigcup_{h<i} \{v_\alpha(h) + s_\alpha C\}$. From (4.2), there is a pair $j_{\alpha 0}(i)$ and $j_{\alpha 1}(i)$ such that $1 \le j_{\alpha 0}(i) \le |T_{\alpha 0}|$, $1 \le j_{\alpha 1}(i) \le |T_{\alpha 1}|$ and

\[ v_\alpha(i) = v_{\alpha 0}(j_{\alpha 0}(i)) \circ v_{\alpha 1}(j_{\alpha 1}(i)). \tag{4.4} \]

The most likely codeword $v_\lambda(1)$ is derived from $v_0(j_0(1))$ and $v_1(j_1(1))$, which can be obtained in turn recursively by (4.4).



In the original (bottom-up) RMLD, a $T_\alpha$-table whose entries are $(v[D], L[D])$ for every $D \in T_\alpha$ is constructed for every index $\alpha$. For a leaf section $I_\alpha$ where $|I_\alpha| = 1$, $p_\alpha C \subseteq \{0,1\}$ and $s_\alpha C = \{0\}$ (if the minimum distance of $C$ is 2 or greater), and therefore $T_\alpha$-table $= \{(b, L_\alpha(b)) : b$ is the best in $p_\alpha C\}$. For a nonleaf section $I_\alpha$, the $T_\alpha$-table is constructed recursively from the $T_{\alpha b}$-tables with $b \in \{0,1\}$ by using (4.2) as follows: for $D \in T_\alpha$, $v[D] = v[\{D_0 \circ D_1 : D_b \in T_{\alpha b}, (D_0, D_1)$ is an adjacent pair in $D\}]$ and $L[D] = \min (L[D_0] + L[D_1])$, where the minimum is taken over the set of adjacent pairs $(D_0, D_1)$ in $D$. For $\alpha = \lambda$, since $p_\lambda C / s_\lambda C = \{C\}$, the $T_\lambda$-table contains the single entry $(c_{ML}, L(c_{ML}))$.
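The bottom-up table construction described above can be sketched for a toy code as follows. This is a simplified illustration, not the author's implementation: it enumerates all codewords, identifies each coset of $s_\alpha C$ by its lexicographically smallest member (an assumption replacing the id-numbers of Section 2), and uses the assumed (8, 4) first-order RM code with assumed received values. Adjacent pairs are detected by testing whether the concatenation of the two best sub-codewords lies in $p_\alpha C$.

```python
from itertools import product

def span(gen):
    """All codewords generated by the rows of gen (tuples of 0/1)."""
    n = len(gen[0])
    return {tuple(sum(c & row[i] for c, row in zip(coeffs, gen)) % 2 for i in range(n))
            for coeffs in product((0, 1), repeat=len(gen))}

def xor(u, v):
    return tuple(a ^ b for a, b in zip(u, v))

def rmld_table(code, r, lo, hi):
    """T-table for the section of positions lo..hi-1 (0-based):
    maps a coset key of p C / s C to (best sub-codeword, its discrepancy)."""
    z = [0 if x >= 0 else 1 for x in r]                    # hard decisions
    pC = {c[lo:hi] for c in code}                          # punctured code
    sC = {c[lo:hi] for c in code
          if not any(c[:lo]) and not any(c[hi:])}          # shortened code

    def canon(v):
        return min(xor(v, w) for w in sC)                  # coset key: smallest member

    table = {}

    def offer(v, cost):
        key = canon(v)
        if key not in table or cost < table[key][1]:
            table[key] = (v, cost)

    if hi - lo == 1:                                       # leaf section
        for b in pC:
            offer(b, abs(r[lo]) if b[0] != z[lo] else 0.0)
    else:                                                  # nonleaf: combine sub-tables
        mid = (lo + hi) // 2
        left = rmld_table(code, r, lo, mid)
        right = rmld_table(code, r, mid, hi)
        for v0, c0 in left.values():
            for v1, c1 in right.values():
                if v0 + v1 in pC:                          # (D_0, D_1) is an adjacent pair
                    offer(v0 + v1, c0 + c1)
    return table

# Assumed example data.
G = [(1, 1, 1, 1, 1, 1, 1, 1),
     (0, 1, 0, 1, 0, 1, 0, 1),
     (0, 0, 1, 1, 0, 0, 1, 1),
     (0, 0, 0, 0, 1, 1, 1, 1)]
code = span(G)
r = [0.875, -0.25, 1.0, 0.5, -0.75, 0.375, 1.25, -0.125]

top = rmld_table(code, r, 0, len(r))       # single entry (c_ML, L(c_ML))
(c_ml, l_ml), = top.values()
print(c_ml, l_ml)
```

A practical decoder would work with generator matrices of the punctured and shortened codes rather than with explicit codeword sets, and the top-down and adaptive variants would fill these tables only on demand.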

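Example 1 [5]: Consider the $l$-th level section $I_\alpha$ of the uniform binary sectionalization for $\mathrm{RM}_{r,m}$ (the $r$-th order RM code of length $2^m$) with $0 \le l = |\alpha| \le m$,

\[ p_\alpha \mathrm{RM}_{r,m} = \mathrm{RM}_{\min\{r,\,m-l\},\,m-l}, \tag{4.5} \]

\[ s_\alpha \mathrm{RM}_{r,m} = \begin{cases} \mathrm{RM}_{r-l,\,m-l}, & \text{for } r \ge l, \\ \{0\}, & \text{for } r < l. \end{cases} \tag{4.6} \]

Then,

\[ \log_2 |T_\alpha| = \log_2 |p_\alpha \mathrm{RM}_{r,m}| - \log_2 |s_\alpha \mathrm{RM}_{r,m}| = \sum_{i=\max\{r-l+1,\,0\}}^{\min\{r,\,m-l\}} \binom{m-l}{i}. \tag{4.7} \]

△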

This example shows that for RM codes with the uniform binary sectionalization, the $T_\alpha$ of the same level are identical. This holds for a class of codes, including RM codes and EBCH codes, called binary transitive invariant. The definition of the class is given in Appendix A. For a binary transitive invariant code, the following lemma holds [6].

Transitive Invariant Lemma: Suppose that $C$ of length $2^m$ is binary-transitive-invariant and the uniform binary sectionalization is used. Then $p_\alpha C$ (or $s_\alpha C$) is the same for every section $I_\alpha$ of the same level $|\alpha|$. △

Now, we explain the main idea of top-down and adaptive RMLD. 

(T) Introduction of call-by-need approach: Simulation results [4] show that $j_{\alpha 0}(i) \ll |T_{\alpha 0}|$ or $j_{\alpha 1}(i) \ll |T_{\alpha 1}|$ in (4.4) for almost all cases of relatively small $|\alpha|$ and $i$, as the law of large numbers suggests. To make effective use of this fact, we reorganized the recursive search procedure in the original RMLD by a call by need approach. The simulation results [4] show that the computational complexity in terms of the number of addition equivalent operations can be remarkably reduced. In contrast to the original RMLD, the top-down RMLD requires only a very small portion of the $T_\alpha$-tables to be constructed on average.

(A) In the top-down RMLD [4], the MLL decomposition (4.2) was still used. We have noticed that the recursion levels whose complexities are dominant are a few higher levels with nonzero small $|\alpha|$'s. A weak point of using the partition $T_\alpha = p_\alpha C / s_\alpha C$ is that the decreasing rate of the block size $|s_\alpha C|$ as $|\alpha|$ increases is much greater than that of $|p_\alpha C|$, that is, the decreasing rate of $|T_\alpha|$ is smaller than we expected.

For a top-down type algorithm based on call-by-need approach, it is essential 
to make use of an effective early termination condition. In general, such a condition 
is more effective for a partition with a relatively large block size. 

(A1) From the above considerations, the adaptive RMLD presented in this paper adopts the coarsest parallel concatenation decomposition (3.4) as a basis of recursion. For a linear subcode $A$ of $p_\alpha C$,

\[ A = \bigcup_{id \in Id_{A/(s_0 A \circ s_1 A)}} (p_0 A / s_0 A)(id) \circ (p_1 A / s_1 A)(id), \]

where $Id_{A/(s_0 A \circ s_1 A)} = Id_{p_b A / s_b A}$ with $b \in \{0,1\}$. Define

\[ PF_\alpha \triangleq p_\alpha C / (s_0(p_\alpha C) \circ s_1(p_\alpha C)). \tag{4.8} \]

Except for $\alpha = \lambda$ where $PF_\alpha = PT_\alpha$, the number of blocks in the above decomposition with $A = p_\alpha C$, namely $|PF_\alpha|$, is considerably smaller than $|PT_\alpha|$, as shown in Table 1 for RM codes and several EBCH codes of length 128. Consequently, the worst case search spaces of recursively called subprocedures can be reduced effectively.

(A2) From the decomposition (3.8) or (3.9), once $v_{\alpha 0}(j_{\alpha 0}(i))$ with $j_{\alpha 0}(i) \le j_{\alpha 1}(i)$ in (4.4) has been found, the counterpart $v_{\alpha 1}(j_{\alpha 1}(i))$ can be found in a very simple way.

(A3) In the adaptive RMLD, a new sufficient condition that a currently best candidate is the optimum at the current level of recursion is used as an early termination condition of the recursion (refer to Sec. 5).

(A4) Preliminary simulation results [12] for the (128, 64) RM code show that the adaptive RMLD presented in this paper provides a considerably smaller average decoding complexity than the original RMLD [3, 6] and top-down RMLD [7, 8] in terms of the number of addition equivalent operations.



Table 1: The dimensions of PT_α and PF_α for several RM and EBCH codes of length 128

         RM(128,64)    EBCH(128,57)   EBCH(128,64)   EBCH(128,71)   EBCH(128,78)
  |α|    PT_α  PF_α    PT_α  PF_α     PT_α  PF_α     PT_α  PF_α     PT_α  PF_α
   0      20    20      27    27       34    34       41    41       34    34
   1      30    10      40    10       47    13       54     6       44     6
   2      24     4      26     4       31     1       31     1       29     1
   3      15     1      15     1       16     0       16     0       16     0
   4       8     0       8     0        8     0        8     0        8     0


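Example 2: For RM codes, we use binary polynomial representation. For $0 \le r \le m$, let $P_{r,m}$ denote the set of binary polynomials spanned by monomials of degree $r$ with $m$ binary variables. $P_{r,m}$ is the set of coset leaders of cosets in $\mathrm{RM}_{r,m}/\mathrm{RM}_{r-1,m}$. In (3.4), let $A = p_\alpha \mathrm{RM}_{r,m}$. It follows from (4.5) and (4.6) that for $0 \le l = |\alpha| < m$ and $b \in \{0,1\}$,

\[ p_{\alpha b} \mathrm{RM}_{r,m} = \mathrm{RM}_{\min\{r,\,m-l-1\},\,m-l-1}, \tag{4.9} \]
\[ s_b(p_\alpha \mathrm{RM}_{r,m}) = \mathrm{RM}_{\min\{r,\,m-l\}-1,\,m-l-1}, \quad b \in \{0,1\}. \tag{4.10} \]

Suppose $r < m-l$. Then, $\min\{r, m-l\} = \min\{r, m-l-1\} = r$ and the set of coset leaders of cosets in $p_{\alpha b} \mathrm{RM}_{r,m} / s_b(p_\alpha \mathrm{RM}_{r,m})$ is $P_{r,m-l-1}$, and Eq. (3.4) can be expressed as

\[ p_\alpha \mathrm{RM}_{r,m} = \bigcup_{f \in P_{r,m-l-1}} \{f + s_0(p_\alpha \mathrm{RM}_{r,m})\} \circ \{f + s_1(p_\alpha \mathrm{RM}_{r,m})\}. \tag{4.11} \]

From the definition of $PF_\alpha$ for $C = \mathrm{RM}_{r,m}$, we have that

\[ \log_2 |PF_\alpha| = \binom{m-l-1}{r}. \tag{4.12} \]

For $r \ge m-l$, $p_{\alpha b} \mathrm{RM}_{r,m} = s_b(p_\alpha \mathrm{RM}_{r,m}) = \mathrm{RM}_{m-l-1,\,m-l-1}$ and $\log_2 |PF_\alpha| = 0$.

For comparison, $|PT_\alpha| = |p_\alpha C / (s_{\alpha 0}(C) \circ s_{\alpha 1}(C))|$ for $C = \mathrm{RM}_{r,m}$ is given by

\[
\log_2 |PT_\alpha| = \log_2 |p_\alpha \mathrm{RM}_{r,m}| - 2 \log_2 |s_{\alpha 0} \mathrm{RM}_{r,m}|
= \sum_{i=0}^{\min\{r,\,m-l\}} \binom{m-l}{i} - \begin{cases} 2 \sum_{i=0}^{r-l-1} \binom{m-l-1}{i}, & \text{for } r > l, \\ 0, & \text{for } r \le l, \end{cases}
\]
\[
= \begin{cases} \sum_{i=r-l}^{\min\{r,\,m-l\}} \binom{m-l}{i} - \binom{m-l-1}{r-l-1}, & \text{for } r > l, \\[4pt] \sum_{i=0}^{\min\{r,\,m-l\}} \binom{m-l}{i}, & \text{for } r \le l. \end{cases} \tag{4.13}
\]

△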
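The RM column of Table 1 can be recomputed from the dimension formulas (4.5), (4.6), (4.9) and (4.10). The sketch below is only a numerical cross-check; the function names are hypothetical, and the RM(128, 64) code is taken to be $\mathrm{RM}_{3,7}$.

```python
from math import comb

def rm_dim(r, m):
    """Dimension of RM_{r,m}; a negative order stands for the zero code {0}."""
    if r < 0:
        return 0
    return sum(comb(m, i) for i in range(min(r, m) + 1))

def log2_T(r, m, l):
    """log2 |T_a| = dim p_a RM - dim s_a RM at level l, from (4.5) and (4.6)."""
    return rm_dim(min(r, m - l), m - l) - rm_dim(r - l, m - l)

def log2_PT(r, m, l):
    """log2 |PT_a|: quotient of p_a RM by s_{a0} RM o s_{a1} RM (level l < m)."""
    return rm_dim(min(r, m - l), m - l) - 2 * rm_dim(r - l - 1, m - l - 1)

def log2_PF(r, m, l):
    """log2 |PF_a|: quotient of p_a RM by s_0(p_a RM) o s_1(p_a RM), using (4.10)."""
    return rm_dim(min(r, m - l), m - l) - 2 * rm_dim(min(r, m - l) - 1, m - l - 1)

r, m = 3, 7                      # RM(128, 64)
pt_column = [log2_PT(r, m, l) for l in range(5)]
pf_column = [log2_PF(r, m, l) for l in range(5)]
print(pt_column)                 # PT column of Table 1 for RM(128, 64)
print(pf_column)                 # PF column of Table 1 for RM(128, 64)
```

For $r < m - l$ the value returned by `log2_PF` coincides with the closed form $\binom{m-l-1}{r}$ of (4.12).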



5. Search Procedure for $v_T(i)$

Let $I_{\alpha b}$ with $b \in \{0,1\}$ be a nonleaf section. Let $A$, $A'$ and $B$ denote linear codes over $I_\alpha$ such that

\[ s_\alpha C \subseteq B \subseteq A \subseteq A' \subseteq p_\alpha C. \tag{5.1} \]

For $D \in A'/A$, define

\[ T = D/B. \tag{5.2} \]

We introduce procedure pick($T$) which returns $v_T(i)$, $L_\alpha(v_T(i))$ and $id_T(v_T(i))$ at the $i$-th call with $1 \le i \le |T|$. $L_\alpha(v_T(i))$ and $id_T(v_T(i))$ are abbreviated as $L_T(i)$ and $id_T(i)$, respectively. Since $s_\alpha C \subseteq B$, $v_T(i)$ is an MLL sub-codeword in $T$. By definition, $id_T(i) \ne id_T(j)$ for $i \ne j$. $D$ denotes the search space of pick($T$), $B$ specifies the search units (blocks) of pick($T$), and the $(|B / s_\alpha C| - 1)$ MLL sub-codewords in each coset of $T$ other than the best one can be ignored.

We present a recursive implementation of pick($T$) based on the decomposition (3.9). From (3.9), we have that

\[ D = \bigcup_{id \in Id_{A/(s_0 A \circ s_1 A)}} (p_0 D / s_0 A)(id_{D_0} \circ id) \circ (p_1 D / s_1 A)(id_{D_1} \circ id), \tag{5.3} \]

where $id_{D_b} = id_{p_b A'/p_b A}(p_b D)$. We define

\[ PF(D, A) \triangleq D / (s_0 A \circ s_1 A), \tag{5.4} \]
\[ F_b(D, A) \triangleq p_b PF(D, A) = p_b D / s_b A, \quad \text{for } b \in \{0,1\}, \tag{5.5} \]

where parameter $D$ may be replaced by $u \in D$ or $id_{A'/A}(D)$, and if there is no possible confusion, then parameters $D$ and $A$ may be omitted, and $PF(A) = A/(s_0 A \circ s_1 A)$ and $F_b(A) = p_b A / s_b A$. Define $id_D = id_{A'/A}(D)$. From (5.3) to (5.5),

\[ D = \bigcup_{id \in Id_{PF(A)}} PF(D, A)(id_D \circ id), \tag{5.6} \]
\[ PF(D, A)(id_D \circ id) = F_0(D, A)(id_{D_0} \circ id) \circ F_1(D, A)(id_{D_1} \circ id), \tag{5.7} \]
\[ F_b(D, A)(id_{D_b} \circ id) = \mu_{p_b A'/p_b A}(id_{D_b}) + \mu_{F_b(A)}(id) + s_b A. \tag{5.8} \]

If $D = A$, then $id_D = id_{D_b} = \lambda$.

5.1. Recursive Implementation of pick(T) Based on 

the Coarsest Parallel Concatenation Decomposition 

There are two cases to be considered. 

(Case I) B = A,T — { D }. For this case, T consists of a single coset in A' /A, and 
pick(D) is called only once. For example, let a = A and A! — A — B = C. Then, 
D = C e T\ = C/C , and PF = C/(s 0 C o Sl C), F b = p b C/s b C = T b and pick(D) 
is an example of Case I. Another example is pick (T b (id)). 

(Case II) B / A, D ^ A. For this case, a ^ A from (5.1). Assume that 

$$s_\alpha C \subseteq s_0A \circ s_1A. \tag{5.9}$$

Since $s_\alpha C \subseteq B$, the relation (5.9) holds if the following relation is true:

$$B \subseteq s_0A \circ s_1A. \tag{5.10}$$

Relation (5.10) with $A = p_\alpha C$ and $B = s_b(p_{\alpha/b}C)$, that is,

$$s_b(p_{\alpha/b}C) \subseteq s_0(p_\alpha C) \circ s_1(p_\alpha C), \tag{5.11}$$

holds for $C$ = a RM code or an EBCH code of length 128 and dimension 57, 64,
71 or 78, with the uniform binary sectionalization. For the EBCH codes, (5.11)
is verified by constructing generator matrices for $s_b(p_\alpha C)$ and $s_b(p_{\alpha/b}C)$. A proof
of (5.11) for $RM_{r,m}$ is as follows: Since (5.11) holds for $r = 0$, assume that $r \geq 1$.
Note that $p_\alpha RM_{r-1,m} \subseteq p_{\alpha 0}RM_{r-1,m} \circ p_{\alpha 1}RM_{r-1,m}$. From (4.5) and (4.6),
$p_\alpha RM_{r-1,m} = s_b(p_{\alpha/b}RM_{r,m})$ and $p_{\alpha b'}RM_{r-1,m} = s_{b'}(p_\alpha RM_{r,m})$ with $b' \in \{0,1\}$.
Hence, (5.11) holds for $C = RM_{r,m}$.

In contrast with Case I, pick($T$) may be called two or more times in Case II.
Hereafter, we consider Case II. Case I is a special case where $|T| = 1$. We consider
the processing made by pick($T$) at the $h$th call with $1 \leq h \leq |T|$. Suppose $v_T(i)$,
$L_T(i)$ and $\mathrm{id}_T(i)$ with $1 \leq i < h$ have been found and returned to the parent
procedure. For $\mathrm{id} \in \mathrm{Id}_{PF(A)}$, define

$$p_h(\mathrm{id}) = |\{v_T(i) : v_T(i) \in PF(\mathrm{id}_D \circ \mathrm{id}) \text{ and } 1 \leq i < h\}|, \tag{5.12}$$

$$PF(\mathrm{id})_h \triangleq PF(\mathrm{id}_D \circ \mathrm{id}) \setminus \Big\{\bigcup_{i=1}^{h-1} (v_T(i) + B)\Big\}. \tag{5.13}$$

In pick($T$), subprocedure pick($F_b$) with $b \in \{0,1\}$ is called by need which returns
$v_{F_b}(i)$, $L_{F_b}(i)$, $\mathrm{id}_{F_b}(i)$ together with $\mathrm{id}_{F'_b}(i)$ at the $i$th call, where $F'_b$ is defined
in 5.2.



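(1) Suppose pick($F_b$) has been called $i_b$ times with $b \in \{0,1\}$ and $1 \leq i_b \leq |F_b|$,
and $v_{F_b}(i'_b)$, $L_{F_b}(i'_b)$, $\mathrm{id}_{F_b}(i'_b)$ and $\mathrm{id}_{F'_b}(i'_b)$ with $1 \leq i'_b \leq i_b$ have been found. Define

$$IP_F \triangleq \{\mathrm{id}_{D_b} \backslash \mathrm{id}_{F_b}(i'_b) : 1 \leq i'_b \leq i_b \text{ and } b \in \{0,1\}\}. \tag{5.14}$$

From (2.7) and (3.6), $IP_F \subseteq \mathrm{Id}_{F_b(A)} = \mathrm{Id}_{PF(A)}$. Define $cS_T$ and $cS'_T$ as

$$cS_T = \bigcup_{\mathrm{id} \in IP_F} PF(\mathrm{id})_h, \tag{5.15}$$

$$cS'_T = \bigcup_{\mathrm{id} \in \mathrm{Id}_{PF(A)} \setminus IP_F} PF(\mathrm{id})_h. \tag{5.16}$$

From (5.6),

$$D = \bigcup_{\mathrm{id} \in \mathrm{Id}_{PF(A)}} PF(\mathrm{id}_D \circ \mathrm{id}). \tag{5.17}$$

From (5.13) to (5.15),

$$D \setminus \Big\{\bigcup_{i=1}^{h-1} T(v_T(i))\Big\} = cS_T \cup cS'_T. \tag{5.18}$$

Define

$$c_{best} = v[cS_T] = \text{the best of } \bigcup_{\mathrm{id} \in IP_F} v[PF(\mathrm{id})_h]. \tag{5.19}$$

Hence, $v_T(h)$ is either $c_{best}$ or $v[cS'_T]$.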

Sufficient conditions that $v_T(h) = c_{best}$

If the following (TC$_T$-1) or (TC$_T$-2) holds, then $v_T(h) = c_{best}$.

(TC$_T$-1) $$L_{F_0}(i_0) + L_{F_1}(i_1) \geq L(c_{best}), \tag{5.20}$$

(TC$_T$-2) $$|IP_F| = |PF(A)| = |A|/(|s_0A| \cdot |s_1A|). \tag{5.21}$$

Proof. (i) If (TC$_T$-2) holds, then $v[cS'_T]$ is not defined.

(ii) Suppose (TC$_T$-1) holds and $|IP_F| < |PF(A)|$. For $\mathrm{id} \in \mathrm{Id}_{PF(A)} = \mathrm{Id}_{F_0(A)} = \mathrm{Id}_{F_1(A)}$,
it follows from (5.7) that

$$PF(\mathrm{id}_D \circ \mathrm{id}) = F_0(\mathrm{id}_{D_0} \circ \mathrm{id}) \circ F_1(\mathrm{id}_{D_1} \circ \mathrm{id}). \tag{5.22}$$

For $u \in cS'_T$, there exists $\mathrm{id} \in \mathrm{Id}_{PF(A)} \setminus IP_F$ such that $p_bu \in F_b(\mathrm{id}_{D_b} \circ \mathrm{id})$ with
$b \in \{0,1\}$. Since $L(v) \geq L_{F_b}(i_b)$ for any $v \in F_b(\mathrm{id}_{D_b} \circ \mathrm{id})$ with $\mathrm{id} \in \mathrm{Id}_{F_b} \setminus IP_F$,

$$L(p_bu) \geq L_{F_b}(i_b). \tag{5.23}$$

Hence, $L(u) \geq L_{F_0}(i_0) + L_{F_1}(i_1) \geq L(c_{best})$. $\triangle$
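The stopping rule (TC$_T$-1) can be illustrated on a deliberately simplified model. The sketch below is not the procedure of this paper: it assumes $h = 1$, ignores the cosets of $B$, and represents each $F_b$ by a hypothetical dictionary mapping an id to the metric of its best sub-codeword. The names `best_combination`, `half0` and `half1` are introduced only for this illustration.

```python
def best_combination(half0, half1):
    """Toy model of the early-termination test (TC_T-1).

    half_b maps each id to the metric (smaller is better) of the best
    sub-codeword of coset id in F_b. Both dictionaries are assumed to
    have the same key set. Returns (best total, best id, number of calls).
    """
    # Sorted views play the role of successive returns from pick(F_b).
    order = [sorted(h.items(), key=lambda kv: kv[1]) for h in (half0, half1)]
    pos = [0, 0]                 # i_0, i_1: calls made so far
    last = [None, None]          # L_{F_b}(i_b): metric of the latest return
    registered = set()           # plays the role of IP_F
    best = (float("inf"), None)  # (L(c_best), id)
    n = len(half0)
    while len(registered) < n:   # analogue of (TC_T-2): every id registered
        if None not in last and last[0] + last[1] >= best[0]:
            break                # analogue of (TC_T-1)
        # Call the half whose latest metric is smaller (a heuristic choice).
        if last[0] is None:
            b = 0
        elif last[1] is None:
            b = 1
        else:
            b = 0 if last[0] <= last[1] else 1
        ident, metric = order[b][pos[b]]
        pos[b] += 1
        last[b] = metric
        if ident not in registered:
            registered.add(ident)
            # The partner half is looked up directly (one call in the paper).
            total = half0[ident] + half1[ident]
            best = min(best, (total, ident))
    return best[0], best[1], pos[0] + pos[1]
```

Any id not yet registered has a metric at least `last[b]` in each half, so its total cannot beat the current best once the test fires; this is the same argument as in part (ii) of the proof.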



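If (5.10) holds, then the following (TC$_T$-1') is a corollary of (TC$_T$-1). Without loss
of generality, suppose $i_b$ is updated after $i_{\bar{b}}$.

(TC$_T$-1') $\mathrm{id}_{F_b}(i_b) = \mathrm{id}_{D_b} \circ \mathrm{id}$ with $\mathrm{id} \in IP_F$.

Proof. If (TC$_T$-1') holds, then there is $i'_{\bar{b}}$ such that $1 \leq i'_{\bar{b}} \leq i_{\bar{b}}$ and $\mathrm{id}_{F_{\bar{b}}}(i'_{\bar{b}}) = \mathrm{id}_{D_{\bar{b}}} \circ \mathrm{id}$.
From (5.22), $v = v_{F_b}(i_b) \circ v_{F_{\bar{b}}}(i'_{\bar{b}}) \in PF(\mathrm{id}_D \circ \mathrm{id})$ and $v$ is the best in
$PF(\mathrm{id}_D \circ \mathrm{id})$. Since any $u \in D \setminus PF(\mathrm{id}_D \circ \mathrm{id})$ belongs to some coset in $D/B$ other
than that containing $v$ from (5.10), $v$ may be output already as $v_T(i)$ with $i < h$.
Otherwise, since $L_{F_b}(i_b) \geq L_{F_{\bar{b}}}(i'_{\bar{b}})$, $L_{F_0}(i_0) + L_{F_1}(i_1) \geq L(v) \geq L(c_{best})$. $\triangle$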
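(2) If $v[PF(\mathrm{id})_h]$ is found for every $\mathrm{id} \in IP_F$, then $c_{best}$ can be obtained from
(5.19). We introduce pick($PF(\mathrm{id})_h$) with $\mathrm{id} \in IP_F$ which returns the MLL sub-codeword
$u$ with the smallest discrepancy in $PF(\mathrm{id})_h$ and its discrepancy. If
$PF(\mathrm{id})_h = \emptyset$, then $\emptyset$ is returned.

Assume that the relation (5.10) holds, that is, $B \subseteq s_0A \circ s_1A$. Then
pick($PF(\mathrm{id})_h$) can be reduced to pick($PF'(\mathrm{id}_D \circ \mathrm{id})$), where

$$PF'(\mathrm{id}_D \circ \mathrm{id}) \triangleq PF(\mathrm{id}_D \circ \mathrm{id})/B. \tag{5.24}$$

Procedure pick($PF'(\mathrm{id}_D \circ \mathrm{id})$) returns $u = v[PF(\mathrm{id})_h]$, its discrepancy and
$\mathrm{id}_{PF'(\mathrm{id}_D \circ \mathrm{id})}(u)$ at the $(p_h(\mathrm{id}) + 1)$th call, if $PF(\mathrm{id})_h \neq \emptyset$. For a recursive
implementation of pick($PF'(\mathrm{id}_D \circ \mathrm{id})$), a subprocedure of type pick($p_bPF(\mathrm{id}_D \circ \mathrm{id})/E_b$),
where $b \in \{0,1\}$ and $E_b$ is a linear subcode of $p_bPF(\mathrm{id}_D \circ \mathrm{id}) = F_b(\mathrm{id}_{D_b} \circ \mathrm{id})$,
is called by need from pick($PF'(\mathrm{id}_D \circ \mathrm{id})$). We choose $s_bB$ as $E_b$, which is the
maximal solution under the condition of $E_0 \circ E_1 \subseteq B$. Define

$$F'_b(\mathrm{id}_{D_b} \circ \mathrm{id}) \triangleq F_b(\mathrm{id}_{D_b} \circ \mathrm{id})/s_bB. \tag{5.25}$$

As a result, pick($PF'(\mathrm{id}_D \circ \mathrm{id})$) makes use of the following relation based on (5.22).
For $\mathrm{id} \in \mathrm{Id}_{PF(A)}$,

$$PF(\mathrm{id}_D \circ \mathrm{id})/(s_0B \circ s_1B) = F'_0(\mathrm{id}_{D_0} \circ \mathrm{id}) \circ F'_1(\mathrm{id}_{D_1} \circ \mathrm{id}), \tag{5.26}$$

is a refinement of $PF'(\mathrm{id}_D \circ \mathrm{id}) = PF(\mathrm{id}_D \circ \mathrm{id})/B$. For given $u_0 \in F_0(\mathrm{id}_{D_0} \circ \mathrm{id})$ and
$u_1 \in F_1(\mathrm{id}_{D_1} \circ \mathrm{id})$, a simple procedure to decide if there is $v_T(i) \in PF(\mathrm{id}_D \circ \mathrm{id})$
with $1 \leq i < h$ such that $u_0 \circ u_1 + v_T(i) \in B$ is shown in Appendix B.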







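Figure 2 shows the call-return relation among pick($T$), pick($F_b$) and
pick($PF(\mathrm{id})_h$).

[Figure 2, diagram not reproduced. Labels recoverable from the diagram: the return values $v_T(h)$, $L_T(h)$, $\mathrm{id}_T(h)$ of the $h$th call of pick($T$), issued when condition (TC-1) or (TC-2) holds; pick($PF(\mathrm{id})_h$) with $\mathrm{id} = \mathrm{id}_{D_b} \backslash \mathrm{id}_{F_b}(i)$, called once for a return from pick($F_b$) with $\mathrm{id}_{F_b}(i) = \mathrm{id}_{D_b} \circ \mathrm{id}$. Legend: $T = D/B$, $PF = D/(s_0A \circ s_1A)$, $F_b = p_bD/s_bA$, $PF(\mathrm{id})_h = PF(\mathrm{id}_D \circ \mathrm{id}) \setminus \{\bigcup_{i=1}^{h-1} (v_T(i) + B)\}$.]

Figure 2: The call-return relation among pick($T$), pick($F_b$) and pick($PF(\mathrm{id})_h$).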

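Example 3: For $\alpha \in \{0,1\}^*$, let $A' = A = D = p_\alpha C$. Then, $\mathrm{id}_D = \mathrm{id}_{D_0} = \mathrm{id}_{D_1} = \lambda$.

(i) For $\alpha = \lambda$, $B = C$ and $T = C/C = \{C\}$. Then $PF(C) = C/(s_0C \circ s_1C)$, and
$F_b(C) = p_bC/s_bC = T_b$ with $b \in \{0,1\}$. Since $C \supseteq s_0C \circ s_1C$, (5.9) does not hold
in general.

(ii) We introduce the following new abbreviated notations for $\alpha \in \{0,1\}^*$:

$$PF_\alpha \triangleq PF(p_\alpha C) = p_\alpha C/(s_0(p_\alpha C) \circ s_1(p_\alpha C)), \tag{5.27}$$

$$F_{\alpha,b} \triangleq F_b(p_\alpha C) = p_{\alpha b}C/s_b(p_\alpha C), \quad \text{for } b \in \{0,1\}. \tag{5.28}$$

For $s_\alpha C \subseteq B \subseteq p_\alpha C$, pick($p_\alpha C/B$) can be implemented by calling pick($F_{\alpha,b}$)
and pick($PF(\mathrm{id})_h$), where $\mathrm{id} \in \mathrm{Id}_{PF_\alpha} = \mathrm{Id}_{F_{\alpha,b}}$ is a return value from
pick($F_{\alpha,b}$) at the $i_b$-th call.

(ii.1) If $p_h(\mathrm{id}) = 0$, then $v[PF(\mathrm{id})_h] = v[PF_\alpha(\mathrm{id})]$ and $v_{F_{\alpha,b}}(i_b) = v[F_{\alpha,b}(\mathrm{id})]$.
From (5.7),

$$v[PF_\alpha(\mathrm{id})] = v_{F_{\alpha,b}}(i_b) \circ v[F_{\alpha,\bar{b}}(\mathrm{id})] \tag{5.29}$$

where $F_{\alpha,\bar{b}}(\mathrm{id}) = p_{F_{\alpha,\bar{b}}}(\mathrm{id}) + s_{\bar{b}}(p_\alpha C)$. In (5.29), $v[F_{\alpha,\bar{b}}(\mathrm{id})]$ can be found by calling
pick($F_{\alpha,\bar{b}}(\mathrm{id})$) once. For $\alpha = \lambda$ (Case I), $p_h(\mathrm{id}) = 0$ and it is sufficient to consider
pick($PF_\alpha(\mathrm{id})$) with $|\alpha| > 0$.

(ii.2) If $p_h(\mathrm{id}) \geq 1$, then pick($PF(\mathrm{id})_h$) is reduced to $PF'(p_\alpha C)(\mathrm{id})$. Let $B = s_b(p_{\alpha/b}C)$
for $\alpha \in \{0,1\}^*b$ with $b \in \{0,1\}$. $PF'(p_\alpha C)(\mathrm{id})$ is abbreviated as
$PF'_\alpha(\mathrm{id})$:

$$PF'_\alpha(\mathrm{id}) \triangleq PF'(p_\alpha C)(\mathrm{id}) = (p_{PF_\alpha}(\mathrm{id}) + (s_0(p_\alpha C) \circ s_1(p_\alpha C)))/s_b(p_{\alpha/b}C). \tag{5.30}$$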

On the complexity of pick($T$)

The complexity of pick($T$) depends on the average of $i_0 + i_1$ for which (TC$_T$-1)
holds for given $h$. By definition, $i_0 + i_1 = |IP_F|$ + the number of occurrences of
(TC$_T$-1'), which reduce the computational complexity. Furthermore, for
$\alpha \in \{0,1\}^*$ and $b \in \{0,1\}$, $h$ is not greater than the $i_b$ of the parent procedure. As
is shown in Table 2, the upper limit of $i_b$, $|F_{\alpha,b}| = |p_{\alpha b}C/s_b(p_\alpha C)|$, is smaller than
that of the MLL decomposition, $|T_{\alpha b}|$ $(= |p_{\alpha b}C/s_{\alpha b}C|)$, except for $\alpha = \lambda$ where
$T_{\alpha b} = F_{\alpha,b}$.

Table 2: The dimensions of $T_{\alpha b}$ and $F_{\alpha,b}$ for several RM and EBCH codes of length 128

        RM(128, 64)    EBCH(128, 57)  EBCH(128, 64)  EBCH(128, 71)  EBCH(128, 79)
|α|     T_αb   F_α,b   T_αb   F_α,b   T_αb   F_α,b   T_αb   F_α,b   T_αb   F_α,b
0       20     20      27     27      34     34      41     41      34     34
1       20     10      24     10      30     13      30     6       25     6
2       14     4       15     4       16     1       16     1       15     1
3       8      1       8      1       8      0       8      0       8      0
4       4      0       4      0       4      0       4      0       4      0


For a linear block code $E$, let $d_H(E)$ denote the minimum distance of $E$.
Since $(p_0A) \circ (p_1A) \supseteq A \supseteq s_bA \circ \{0\}$ with $b \in \{0,1\}$,

$$d_H(p_0A) + d_H(p_1A) \leq d_H(A) \leq d_H(s_bA). \tag{5.31}$$

Suppose $A$ is binary transitive invariant and the uniform binary sectionalization
is used. Then $p_0A = p_1A$, and therefore,

$$2d_H(p_bA) \leq d_H(s_bA). \tag{5.32}$$

Note that $d_H(p_bA)$ is the minimum distance between different cosets in $F_b$ and
$d_H(s_bA)$ is the minimum distance of each coset in $F_b$. For relatively small $|\alpha|$, $h$
and $i_b < |F_b|$, the possibility that there exists better $u$ in $p_bD \setminus \bigcup_{i=1}^{i_b} F_b(v_{F_b}(i))$
than $v_{F_b}(i_b)$ is small on average.
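As a small sanity check of (5.31) and (5.32), the sketch below enumerates the first-order Reed-Muller code of length 8 and compares the minimum distances of the code, its two punctured halves and its shortened halves. The choice of $RM_{1,3}$ and all function names are assumptions made for illustration only; the check says nothing about codes outside this example.

```python
from itertools import product


def rm1_codewords(m):
    """All codewords of RM(1, m): evaluations of a0 + a1*x1 + ... + am*xm.

    Points are listed with x1 as the most significant bit, so the left
    half of each codeword is the restriction to x1 = 0.
    """
    points = list(product((0, 1), repeat=m))
    words = set()
    for coeffs in product((0, 1), repeat=m + 1):
        word = tuple(
            (coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], pt))) % 2
            for pt in points
        )
        words.add(word)
    return words


def min_distance(code):
    """Minimum Hamming weight of a nonzero codeword of a linear code."""
    return min(sum(w) for w in code if any(w))


def punctured(code, lo, hi):
    """p_b A: restrict every codeword to positions lo..hi-1."""
    return {w[lo:hi] for w in code}


def shortened(code, lo, hi):
    """s_b A: codewords vanishing outside lo..hi-1, restricted to that range."""
    return {w[lo:hi] for w in code if not any(w[:lo]) and not any(w[hi:])}
```

For `A = rm1_codewords(3)` the left and right halves are `punctured(A, 0, 4)` and `punctured(A, 4, 8)`, and the shortened code on the left half is `shortened(A, 0, 4)`.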

5.2. Outline of pick($PF'(\mathrm{id}_D \circ \mathrm{id})$)

Note that pick($PF'(\mathrm{id}_D \circ \mathrm{id})$) with $\mathrm{id} \in IP_F$ is called in one of the following
situations:

(i) $\mathrm{id}$ is just registered to $IP_F$ and $v[PF(\mathrm{id})_h]$ is to be put in $cS_T$ as the first
representative from $PF(\mathrm{id}_D \circ \mathrm{id})$ $(= PF(\mathrm{id})_h)$.








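(ii) pick($T$) has returned $v_{PF'(\mathrm{id}_D \circ \mathrm{id})}(p_h(\mathrm{id}))$ as $v_T(h-1)$ and pick($T$) is currently
called to find $v_T(h)$. Then, for finding a new $c_{best}$ in (5.19), $v_{PF'(\mathrm{id}_D \circ \mathrm{id})}(p_h(\mathrm{id}) + 1)$
is required to make up for $v_T(h-1) = v_{PF'(\mathrm{id}_D \circ \mathrm{id})}(p_h(\mathrm{id}))$.

Subprocedure pick($F'_b(\mathrm{id}_{D_b} \circ \mathrm{id})$) is called by need to obtain $v_{F'_b(\mathrm{id}_{D_b} \circ \mathrm{id})}(j_b)$
at the $j_b$th call. When pick($F_b$) at the $i$th call returns $v_{F_b}(i)$, $L_{F_b}(i)$ and $\mathrm{id} =
\mathrm{id}_{D_b} \backslash \mathrm{id}_{F_b}(i)$ together with $\mathrm{id}_{F'_b(\mathrm{id}_{D_b} \circ \mathrm{id})}(i)$, that is, in the above situation (i),
$v_{F'_b(\mathrm{id}_{D_b} \circ \mathrm{id})}(1) = v_{F_b}(i)$. That is, the first call of pick($F'_b(\mathrm{id}_{D_b} \circ \mathrm{id})$) simply refers
to the return value of pick($F_b$) at the $i$th call.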

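To make use of the structure (5.26), we introduce new notations.

Define a set of ordered pairs of positive integers $P = \{(i_0, i_1) : 1 \leq i_b \leq \nu_b \triangleq
|s_bA|/|s_bB| \text{ for } b \in \{0,1\}\}$, $v(i_0, i_1) = v_{F'_0(\mathrm{id}_{D_0} \circ \mathrm{id})}(i_0) \circ v_{F'_1(\mathrm{id}_{D_1} \circ \mathrm{id})}(i_1)$ for $(i_0, i_1) \in P$,
$P_{p_h(\mathrm{id})} = \{(i_0, i_1) \in P : v(i_0, i_1) = v_{PF'(\mathrm{id}_D \circ \mathrm{id})}(i) \text{ for } 1 \leq i \leq p_h(\mathrm{id})\}$, and
$\bar{P}_{p_h(\mathrm{id})} = P \setminus P_{p_h(\mathrm{id})}$.

We introduce the following partial order "$\leq$" into $P$:

$$(i_0, i_1) \leq (i'_0, i'_1) \iff i_0 \leq i'_0 \text{ and } i_1 \leq i'_1.$$

For $p = (i_0, i_1)$ and $p' = (i'_0, i'_1)$ in $P$, we write $p < p'$ iff $p \leq p'$ and $p \neq p'$, and
$p \,|\, p'$ iff $p \not\leq p'$ and $p' \not\leq p$. For $p \in P$, define $\kappa p = \{p' \in P : p \leq p'\}$ and let $\partial\bar{P}_{p_h(\mathrm{id})}$
denote the set of minimal pairs in $\bar{P}_{p_h(\mathrm{id})}$. Then,

($\partial$1) for $p$ and $p'$ in $\partial\bar{P}_{p_h(\mathrm{id})}$, $p \,|\, p'$,

($\partial$2) $\bar{P}_{p_h(\mathrm{id})} = \bigcup_{p \in \partial\bar{P}_{p_h(\mathrm{id})}} \kappa p$.

Define $L(i_0, i_1) = L(v(i_0, i_1))$. Then, for $p < p'$ in $P$, $L_p \leq L_{p'}$ $^3$. Hence, $v(i_0, i_1)$
is a candidate for $v[PF(\mathrm{id})_h]$ only if $(i_0, i_1) \in \partial\bar{P}_{p_h(\mathrm{id})}$. From ($\partial$1), we can number
the pairs in $\partial\bar{P}_{p_h(\mathrm{id})}$ as follows:

($\partial$3) $\partial\bar{P}_{p_h(\mathrm{id})} = \{(i_0^{(j)}, i_1^{(j)}) : 1 \leq j \leq \delta \triangleq |\partial\bar{P}_{p_h(\mathrm{id})}|\}$, where
$\nu_0 \geq i_0^{(1)} > i_0^{(2)} > \cdots > i_0^{(\delta)}$ and $i_1^{(1)} < \cdots < i_1^{(\delta)} \leq \nu_1$.

Here we assume that $v_{F'_0(\mathrm{id}_{D_0} \circ \mathrm{id})}(j'_0)$ with $1 \leq j'_0 \leq i_0^{(1)} = j_0$ and $v_{F'_1(\mathrm{id}_{D_1} \circ \mathrm{id})}(j'_1)$
with $1 \leq j'_1 \leq i_1^{(\delta)} = j_1$ have been computed by pick($F_b$) or pick($F'_b(\mathrm{id}_{D_b} \circ \mathrm{id})$)
and $\{L_p : p \in \partial\bar{P}_{p_h(\mathrm{id})}\}$ is listed, e.g., by a priority queue. From ($\partial$2), there exists
unique $j_m$ such that $1 \leq j_m \leq \delta$ and

($\partial$4) $v(i_0^{(j_m)}, i_1^{(j_m)}) = v[\partial\bar{P}_{p_h(\mathrm{id})}]$.

There are two cases:

Case 1: If $\mathrm{id}_{PF'(\mathrm{id}_D \circ \mathrm{id})}(v(i_0^{(j_m)}, i_1^{(j_m)})) \notin \{\mathrm{id}_{PF'(\mathrm{id}_D \circ \mathrm{id})}(v_T(i)) : v_T(i) \in PF'(\mathrm{id}_D \circ \mathrm{id})
\text{ and } 1 \leq i \leq p_h(\mathrm{id})\}$, that is, $v(i_0^{(j_m)}, i_1^{(j_m)}) \in PF(\mathrm{id})_h$ by using the decision
algorithm in Appendix B, then output $v(i_0^{(j_m)}, i_1^{(j_m)})$ as $v[PF(\mathrm{id})_h]$, delete
$(i_0^{(j_m)}, i_1^{(j_m)})$ from $\partial\bar{P}_{p_h(\mathrm{id})}$ and return.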

3 Refer to footnote 2. 







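Case 2: Otherwise, delete $(i_0^{(j_m)}, i_1^{(j_m)})$ from $\partial\bar{P}_{p_h(\mathrm{id})}$ and update $\partial\bar{P}_{p_h(\mathrm{id})}$ (if necessary).

For Case 1, $\partial\bar{P}_{\text{new } p_h(\mathrm{id})}$ is updated at the next call to pick($PF'(\mathrm{id}_D \circ \mathrm{id})$) with
new $p_h(\mathrm{id})$ = current $p_h(\mathrm{id}) + 1$. For Case 2, $\partial\bar{P}_{p_h(\mathrm{id})}$ is to be updated, if necessary.
Note that for $(i_0, i_1) \in P$,

($\partial$5) $\kappa(i_0, i_1) \setminus \{(i_0, i_1)\} = \kappa(i_0 + 1, i_1)$ (for $i_0 < \nu_0$) $\cup\ \kappa(i_0, i_1 + 1)$ (for $i_1 < \nu_1$).

Hence, in order to meet ($\partial$1) and ($\partial$2), each of the following pairs needs to be added
to $\partial\bar{P}_{p_h(\mathrm{id})} \setminus \{(i_0^{(j_m)}, i_1^{(j_m)})\}$ under the following specified conditions:

(i) $(i_0^{(j_m)} + 1, i_1^{(j_m)})$, if (a) either $j_m = 1$ and $i_0^{(1)} < \nu_0$ or (b) $j_m > 1$ and
$i_0^{(j_m - 1)} - i_0^{(j_m)} \geq 2$.

(ii) $(i_0^{(j_m)}, i_1^{(j_m)} + 1)$, if (a) either $j_m = \delta$ and $i_1^{(\delta)} < \nu_1$ or (b) $j_m < \delta$ and
$i_1^{(j_m + 1)} - i_1^{(j_m)} \geq 2$.

For the case (a) only, pick($F'_b(\mathrm{id}_{D_b} \circ \mathrm{id})$) is called to find $v_{F'_b(\mathrm{id}_{D_b} \circ \mathrm{id})}(j_b + 1)$.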
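The update rules (i) and (ii) for the set of minimal pairs can be exercised in isolation. The sketch below is a simplified model, not the procedure pick($PF'$): it assumes the two lists of metrics are already fully known and nondecreasing, treats every deleted pair as an output (so Case 2 never filters anything), and finds the minimum by a linear scan rather than a priority queue. The function name and arguments are hypothetical.

```python
def enumerate_pairs(L0, L1):
    """Yield 1-based pairs (i0, i1) in nondecreasing order of L0[i0-1] + L1[i1-1].

    L0 and L1 must be nondecreasing. The list `frontier` holds the minimal
    pairs of the not-yet-output set, ordered by decreasing i0 (hence
    increasing i1), as in the numbering of the minimal pairs above.
    """
    nu0, nu1 = len(L0), len(L1)

    def cost(p):
        return L0[p[0] - 1] + L1[p[1] - 1]

    frontier = [(1, 1)]
    while frontier:
        jm = min(range(len(frontier)), key=lambda j: cost(frontier[j]))
        i0, i1 = frontier[jm]
        new = []
        # Rule (i): add (i0 + 1, i1) unless the previous minimal pair covers it.
        if i0 < nu0 and (jm == 0 or frontier[jm - 1][0] - i0 >= 2):
            new.append((i0 + 1, i1))
        # Rule (ii): add (i0, i1 + 1) unless the next minimal pair covers it.
        if i1 < nu1 and (jm == len(frontier) - 1 or frontier[jm + 1][1] - i1 >= 2):
            new.append((i0, i1 + 1))
        frontier[jm:jm + 1] = new   # replace the deleted pair, keeping the order
        yield (i0, i1)
```

Because the metric is monotone with respect to the partial order, the minimum over the minimal pairs is the minimum over all remaining pairs, so the pairs come out in nondecreasing order of total metric while at most $\min(\nu_0, \nu_1)$ pairs are kept at any time.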




Figure 3: Illustration of $P$, $P_{p_h(\mathrm{id})}$, $\bar{P}_{p_h(\mathrm{id})}$ and $\partial\bar{P}_{p_h(\mathrm{id})}$.

In Figure 3, suppose (z^^z^) or (4 5 ^4^) * s deleted. Then no new pair is 
to be added. If (zq 3 \z^) is deleted, then (z^ + l,z^) and (zq 3 \z^ + 1) are to 
be added. If (zq 4 \z^) is deleted, then (i^ + l,z^) is to be added. 

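Then, $|\partial\bar{P}_{p_h(\mathrm{id})}| - 1 \leq |\text{updated } \partial\bar{P}_{p_h(\mathrm{id})}| \leq |\partial\bar{P}_{p_h(\mathrm{id})}| + 1$.
If either $j_m = 1$ and $i_0^{(1)} < \nu_0$ or $j_m = \delta$ and $i_1^{(\delta)} < \nu_1$, then $v_{F'_0(\mathrm{id}_{D_0} \circ \mathrm{id})}(i_0^{(1)} + 1)$
with $i_0^{(1)} = j_0$ or $v_{F'_1(\mathrm{id}_{D_1} \circ \mathrm{id})}(i_1^{(\delta)} + 1)$ with $i_1^{(\delta)} = j_1$ is to be found, respectively.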







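Example 4: For $\alpha \in \{0,1\}^*b$ with $b \in \{0,1\}$, $A = p_\alpha C$ and $B = s_b(p_{\alpha/b}C)$.

(i) Subprocedure pick($PF'_\alpha(\mathrm{id}^{(1)})$) with $\mathrm{id}^{(1)} \in \mathrm{Id}_{PF_\alpha} = \mathrm{Id}_{F_{\alpha,b_1}}$ for $b_1 \in \{0,1\}$,
calls pick($F'_{\alpha,b_1}(\mathrm{id}^{(1)})$), by need, which is processed in turn by its subprocedures
on descendant subsections. $F'_{\alpha,b_1}(\mathrm{id}^{(1)})$ is the following abbreviation:

$$F'_{\alpha,b_1}(\mathrm{id}^{(1)}) \triangleq (p_{F_{\alpha,b_1}}(\mathrm{id}^{(1)}) + s_{b_1}(p_\alpha C))/s_{bb_1}(p_{\alpha/b}C). \tag{5.33}$$

Table 3 shows the dimensions of $PF'_\alpha(\mathrm{id})$ (5.30) and $F'_{\alpha,b}(\mathrm{id})$ (5.33) for several RM
and EBCH codes of length 128.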

Table 3: The dimensions of $PF'_\alpha(\mathrm{id})$ and $F'_{\alpha,b}(\mathrm{id})$ for several RM and EBCH
codes of length 128

        RM(128, 64)    EBCH(128, 57)  EBCH(128, 64)  EBCH(128, 71)  EBCH(128, 79)
|α|     PF'_α  F'_α,b  PF'_α  F'_α,b  PF'_α  F'_α,b  PF'_α  F'_α,b  PF'_α  F'_α,b
1       10     10      17     15      21     17      35     24      28     19
2       6      6       6      6       12     10      5      5       5      5
3       3      3       3      3       1      1       1      1       1      1
4       1      1       1      1       0      0       0      0       0      0

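(ii) For convenience, we extend the notations $PF_\alpha$ and $F_{\alpha,b}$ as follows:

For $\alpha$ and $\beta$ in $\{0,1\}^*$ and $b \in \{0,1\}$ such that $I_{\alpha\beta}$ is a nonleaf subsection of $I_\alpha$,

$$PF_{\alpha,\beta} \triangleq s_\beta(p_\alpha C)/(s_{\beta 0}(p_\alpha C) \circ s_{\beta 1}(p_\alpha C)), \tag{5.34}$$

$$F_{\alpha,\beta b} \triangleq p_b s_\beta(p_\alpha C)/s_{\beta b}(p_\alpha C). \tag{5.35}$$

Abbreviate $PF_{\alpha,\lambda}$ as $PF_\alpha$. From (5.34) and (5.35),

$$p_b PF_{\alpha,\beta} = F_{\alpha,\beta b}. \tag{5.36}$$

(iii) Subprocedure pick($F'_{\alpha,b_1}(\mathrm{id}^{(1)})$) can be implemented as the Case II in 5.1,
where $A' = p_{\alpha b_1}C$, $A = s_{b_1}(p_\alpha C)$, $D = p_{b_1}p_{PF_\alpha}(\mathrm{id}^{(1)}) + A = p_{F_{\alpha,b_1}}(\mathrm{id}^{(1)}) + A$ and
$B = s_{bb_1}(p_{\alpha/b}C)$, by subprocedures pick($F_{\alpha,b_1b_2}(\mathrm{id}^{(1)}, -)$) with $b_2 \in \{0,1\}$ and
pick($PF'_{\alpha,b_1}(\mathrm{id}^{(1)}, \mathrm{id}^{(2)})$), where $\mathrm{id}^{(2)} \in \mathrm{Id}_{F_{\alpha,b_1b_2}(\mathrm{id}^{(1)},-)}$ is one of the return values
from pick($F_{\alpha,b_1b_2}(\mathrm{id}^{(1)}, -)$),

$$F_{\alpha,b_1b_2}(\mathrm{id}^{(1)}, -) \triangleq (p_{b_2}p_{F_{\alpha,b_1}}(\mathrm{id}^{(1)}) + p_{b_2}s_{b_1}(p_\alpha C))/s_{b_1b_2}(p_\alpha C), \tag{5.37}$$

$$PF'_{\alpha,b_1}(\mathrm{id}^{(1)}, \mathrm{id}^{(2)}) \triangleq (p_{b_1}p_{PF_\alpha}(\mathrm{id}^{(1)}) + p_{PF_{\alpha,b_1}(\mathrm{id}^{(1)})}(\mathrm{id}^{(2)}) + s_{b_10}(p_\alpha C) \circ s_{b_11}(p_\alpha C))/s_{bb_1}(p_{\alpha/b}C). \tag{5.38}$$

Refer to Figure 4, where

$$F_{\alpha,b_1b_2}(\mathrm{id}^{(1)}, \mathrm{id}^{(2)}) \triangleq (p_{b_1b_2}p_{PF_\alpha}(\mathrm{id}^{(1)}) + p_{b_2}p_{PF_{\alpha,b_1}}(\mathrm{id}^{(2)}) + s_{b_1b_2}(p_\alpha C))/s_{bb_1b_2}(p_{\alpha/b}C). \tag{5.39}$$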



[Figure 4, diagram not reproduced. Sets appearing in the diagram: $PF'_\alpha(\mathrm{id}^{(1)})$: (5.30); $F'_{\alpha,b_1}(\mathrm{id}^{(1)})$: (5.33); $F_{\alpha,b_1b_2}(\mathrm{id}^{(1)}, -)$: (5.37); $PF'_{\alpha,b_1}(\mathrm{id}^{(1)}, \mathrm{id}^{(2)})$: (5.38), with $\mathrm{id}^{(2)} \in \mathrm{Id}_{F_{\alpha,b_1b_2}(\mathrm{id}^{(1)},-)}$; $F_{\alpha,b_1b_2}(\mathrm{id}^{(1)}, \mathrm{id}^{(2)})$: (5.39). Calls are made by need, with $b_1, b_2 \in \{0,1\}$.]

Figure 4: The call-return relation among pick($PF'_\alpha(\mathrm{id}^{(1)})$), pick($F'_{\alpha,b_1}(\mathrm{id}^{(1)})$),
pick($PF'_{\alpha,b_1}(\mathrm{id}^{(1)}, \mathrm{id}^{(2)})$), pick($F_{\alpha,b_1b_2}(\mathrm{id}^{(1)}, -)$) and pick($F_{\alpha,b_1b_2}(\mathrm{id}^{(1)}, \mathrm{id}^{(2)})$).



6. Preliminary Simulation Results [12]

Figures 5 and 6 [12] show the simulation results of block error probabilities and
average numbers of addition equivalent operations (AEO) for the (128, 64, 16)
RM code, respectively. The number of AEO for the code by the standard Viterbi
decoding with 128 sections is 16,897,966,073. As compared with the previously
presented top-down RMLD for $RM_{3,7}$ [7], [8], where the bit positions are permuted
so that the left half 64 bits form the most reliable basis [9], the simulation range of
the proposed algorithm has been extended to 0.0 dB and the decoding complexity is
significantly reduced. These are good indications of the effectiveness of the coarsest
parallel concatenation decomposition technique in making use of the fine structure of
the target code. Simulations for some EBCH codes and RM codes and the detailed
analysis of decoding complexity are under study by coworkers.




Figure 5: Block error probability for $RM_{3,7}$.



Appendix A: Binary Transitive Invariant Codes [6] 

For a positive integer $m$ and a nonnegative integer $j$ less than $2^m$, represent $j$ in
a binary expression as

$$j = j_1 2^{m-1} + j_2 2^{m-2} + \cdots + j_m, \quad j_i \in \{0,1\} \text{ for } 1 \leq i \leq m.$$

There is a one-to-one mapping $\varphi_m$ from the set of binary polynomials with $m$
variables, $P_m$, to $V_{2^m}$ such that

$$\varphi_m(f) = (u_1, u_2, \ldots, u_{2^m}), \tag{A-1}$$

where $u_{j+1} = f(j_1, j_2, \ldots, j_m)$ for $0 \leq j < 2^m$.

A binary block code $B$ of length $2^m$ is binary transitive invariant, if and
only if for any $f \in P_m$ and $(b_1, b_2, \ldots, b_m) \in V_m$, $\varphi_m(f(x_1, x_2, \ldots, x_m)) \in B \iff
\varphi_m(f(x_1 + b_1, x_2 + b_2, \ldots, x_m + b_m)) \in B$. RM codes and EBCH codes are binary
transitive invariant.
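The definition can be checked by brute force on a small case. The sketch below builds $RM_{r,m}$ from monomials of degree at most $r$ and tests invariance under every translation $x \mapsto x + b$, which on codeword positions is the permutation $j \mapsto j \oplus b$. The function names are hypothetical, the check is exhaustive and only practical for small $m$, and the EBCH case is not covered.

```python
from itertools import combinations, product


def rm_code(r, m):
    """All codewords of RM(r, m) as tuples of length 2**m.

    Position j corresponds to the point (j_1, ..., j_m) with j_1 the most
    significant bit of j, matching the binary expression of j above.
    """
    points = list(product((0, 1), repeat=m))
    monomials = [s for d in range(r + 1) for s in combinations(range(m), d)]
    rows = [tuple(int(all(pt[i] for i in s)) for pt in points) for s in monomials]
    words = {tuple([0] * len(points))}
    for row in rows:
        # Double the span by adding the new generator row to every word.
        words |= {tuple(a ^ b for a, b in zip(w, row)) for w in words}
    return words


def is_translation_invariant(code, m):
    """True if the code is closed under x -> x + b for every b in V_m."""
    n = 2 ** m
    for b in range(n):
        for w in code:
            if tuple(w[j ^ b] for j in range(n)) not in code:
                return False
    return True
```

For example, `is_translation_invariant(rm_code(1, 3), 3)` checks the first-order code of length 8, while a code spanned by a single unit vector fails the test.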









Figure 6: Average numbers of AEO. 



Appendix B: A Decision Procedure Where $u + v \in B$ for $u, v \in s_0A \circ s_1A$ in (5.10)



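We can choose a generator matrix $G_B$ of $B$ of the following form:

$$G_B = \left[\begin{array}{cc} G_{B,0} & 0 \\ 0 & G_{B,1} \\ \multicolumn{2}{c}{G_{B,0,1}} \end{array}\right] \tag{B-1}$$

where $G_{B,b}$ with $b \in \{0,1\}$ is a generator matrix of $s_bB$ and $G_{B,0,1}$ is a generator
matrix of $[B/(s_0B \circ s_1B)]$. As stated for (3.5), there is a one-to-one correspondence
between the sets of rows in $p_0G_{B,0,1}$ and $p_1G_{B,0,1}$, respectively. From (B-1) and
(5.10), we can choose a generator matrix $G_b$ of $s_bA$ of the form:

$$G_b = \begin{bmatrix} G_{B,b} \\ p_bG_{B,0,1} \\ G_b^{(1)} \end{bmatrix}$$

and submatrix

$$G'_b \triangleq \begin{bmatrix} p_bG_{B,0,1} \\ G_b^{(1)} \end{bmatrix}$$

is a generator matrix of $[s_bA/s_bB]$. For $\mathrm{id}_b \in \mathrm{Id}_{s_bA/s_bB}$, $\mathrm{id}_b$ can be partitioned into
two subsections $\mathrm{id}_{b,1}$ and $\mathrm{id}_{b,2}$ corresponding to submatrices $p_bG_{B,0,1}$ and $G_b^{(1)}$,
respectively.

Note that

$$G_{s_0A \circ s_1A} = \begin{bmatrix} G_{B,0} & 0 \\ 0 & G_{B,1} \\ p_0G_{B,0,1} & 0 \\ 0 & p_1G_{B,0,1} \\ G_0^{(1)} & 0 \\ 0 & G_1^{(1)} \end{bmatrix}$$

is a generator matrix of $s_0A \circ s_1A$. Define

$$G' \triangleq \begin{bmatrix} 0 & p_1G_{B,0,1} \\ G_0^{(1)} & 0 \\ 0 & G_1^{(1)} \end{bmatrix}.$$

Since $\begin{bmatrix} G_B \\ G' \end{bmatrix}$ is derived from $G_{s_0A \circ s_1A}$ by row
operations, $G'$ is a generator matrix of $[(s_0A \circ s_1A)/B]$. For $\mathrm{id} \in \mathrm{Id}_{(s_0A \circ s_1A)/B}$, $\mathrm{id}$
can be partitioned into three subsections $\mathrm{id}_1$, $\mathrm{id}_2$ and $\mathrm{id}_3$ corresponding to submatrices
of $G'$, $[0, p_1G_{B,0,1}]$, $[G_0^{(1)}, 0]$ and $[0, G_1^{(1)}]$, respectively. It follows from (3.6)
and the definitions of $G'_b$ and $G'$ that for $u_b \in s_bA$, $\mathrm{id}_1(u_0 \circ u_1) = \mathrm{id}_{1,1}(u_1)$,
$\mathrm{id}_2(u_0 \circ u_1) = \mathrm{id}_{0,2}(u_0)$, and $\mathrm{id}_3(u_0 \circ u_1) = \mathrm{id}_{1,2}(u_1)$. Consequently, $u = u_0 \circ u_1$
and $v = v_0 \circ v_1$, where $u_b$ and $v_b$ are in $s_bA$, are in the same coset in $(s_0A \circ s_1A)/B$,
iff

$$\mathrm{id}_{1,1}(u_1) = \mathrm{id}_{1,1}(v_1), \tag{B-2}$$

$$\mathrm{id}_{b,2}(u_b) = \mathrm{id}_{b,2}(v_b). \tag{B-3}$$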
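For comparison, the same question (whether $u$ and $v$ lie in the same coset of $B$, that is, whether $u + v \in B$) can also be answered by plain Gaussian elimination over GF(2). The sketch below is this generic technique, not the id-based procedure of this appendix; it makes no use of the block structure of (B-1). Binary vectors are represented as Python integers and all names are hypothetical.

```python
def reduce_gf2(rows):
    """Echelon basis {leading bit position: vector} of the GF(2) span of rows."""
    basis = {}
    for v in rows:
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]   # cancel the leading bit and keep reducing
    return basis


def in_span(basis, v):
    """True if the bit vector v lies in the span described by basis."""
    while v:
        top = v.bit_length() - 1
        if top not in basis:
            return False
        v ^= basis[top]
    return True


def same_coset(basis_B, u, v):
    """True if u + v is a codeword of B, i.e. u and v lie in the same coset."""
    return in_span(basis_B, u ^ v)
```

The generic test costs one reduction per query, whereas the procedure above only compares the stored id subsections, which is why the latter is preferred inside the recursion.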



Acknowledgment 

The author is grateful to a reviewer and Drs. T. Fujiwara, Y. Kaji and T. Koumoto 
for their valuable suggestions to improve the manuscript, and Dr. H. Tokushige 
and Mr. I. Hisadomi for their help in preparing the manuscript. 



References 

[1] Y.S. Han, C.R.P. Hartmann, and C.-C. Chen, “Efficient priority first search 
maximum-likelihood soft-decision decoding of linear block codes,” IEEE Trans. In- 
form. Theory, vol. 39, pp. 1514-1523, Sept. 1993. 

[2] T. Kaneko, T. Nishijima, H. Inazumi, and S. Hirasawa, “An efficient maximum- 
likelihood decoding algorithm for linear block codes with algebraic decoder,” IEEE 
Trans. Inform. Theory, vol. 40, pp. 320-327, Mar. 1994. 

[3] T. Fujiwara, H. Yamamoto, T. Kasami and S. Lin, “A trellis-based recursive maxi- 
mum likelihood decoding algorithm for linear codes,” IEEE Trans. Inform. Theory, 
vol. 44, pp. 714-729, Mar. 1998. 

[4] Y. Kaji, T. Fujiwara and T. Kasami, “An efficient call-by-need algorithm for the 
maximum likelihood decoding of a linear code,” 2000 International Symposium on 
Information Theory and Its Applications, pp. 335-338, Honolulu, HI, Nov. 2000. 







[5] A. Lafourcade and A. Vardy, “Optimum sectionalization of a trellis,” IEEE Trans.
Inform. Theory, vol. 42, pp. 689-703, May 1996.

[6] T. Kasami, H. Tokushige, T. Fujiwara, H. Yamamoto and S. Lin, “A recursive maxi- 
mum likelihood decoding algorithm for some transitive invariant binary block codes,” 
IEICE Trans. Fundamentals, vol. E81-A, pp. 1916-1924, Sept. 1998. 

[7] T. Koumoto, and T. Kasami, “Top-down recursive maximum likelihood decoding 
using ordered statistics information for half rate codes,” Technical Report of IEICE, 
IT2002-29, The Institute of Electronics, Information and Communication Engineers, 
pp. 13-18, Japan, Sept. 2002. 

[8] T. Koumoto and T. Kasami, “Top-down recursive maximum likelihood decoding
using ordered statistics information,” Proc. of the IEEE Inform. Theory Workshop,
pp. 202, Bangalore, India, Oct. 2002.

[9] M.P.C. Fossorier and S. Lin, “Soft-decision decoding of linear block codes based on
ordered statistics,” IEEE Trans. Inform. Theory, vol. 41, pp. 1379-1396, Sept. 1995.

[10] S. Lin, T. Kasami, T. Fujiwara and M. Fossorier, “Trellises and Trellis-Based Decoding
Algorithms for Linear Block Codes,” Kluwer Academic Publishers, Norwell,
MA, 1998.

[11] T. Kasami, T. Fujiwara, Y. Kaji and T. Koumoto, “Adaptive recursive maximum 
likelihood decoding based on parallel concatenation decomposition,” Technical Re- 
port of IEICE, IT2002-66, March, 2003. 

[12] T. Koumoto, Y. Kaji, T. Fujiwara and T. Kasami, “Implementation and simulation 
results of adaptive recursive maximum likelihood decoding,” Technical Report of 
IEICE, IT2003-28, July 2003. 



Tadao Kasami 
Ohata-cho 4-26 
Nishinomiya-shi 
Hyogo, 662-0836, Japan 
e-mail: kasami73@nifty.com 




Progress in Computer Science and Applied Logic, Vol. 23, 51-65 
© 2004 Birkhauser Verlag Basel/Switzerland 



Modularity of Asymptotically Optimal Towers 
of Function Fields 

Wen-Ching Winnie Li 



Abstract. Elkies conjectures that all recursively defined asymptotically opti- 
mal towers of function fields over finite fields with square cardinality arise 
from elliptic modular curves, Shimura curves, or Drinfeld modular curves by 
appropriate reduction. In this paper we review the recursive asymptotically 
optimal towers constructed so far, discuss the reasons behind Elkies’ conjec- 
ture, present numerical evidence of this conjecture, and sketch Elkies’ proof 
of modularity of the new families. 

Mathematics Subject Classification (2000). Primary 14G50, 14G05, 11R58. 
Keywords. Modular curves, optimal towers. 



1. Introduction 

It is well known that for codes over a finite field $\mathbb{F}$ with square cardinality $q$ at
least 49, the algebraic geometry bound of the information rate is better than the
Gilbert-Varshamov bound. This is achieved by exhibiting families, called asymptotically
optimal families, of smooth curves defined over $\mathbb{F}$ such that the ratio of the
number of $\mathbb{F}$-rational points to the genus approaches the optimal value $\sqrt{q} - 1$.
Appropriate reductions of elliptic modular curves, Shimura curves, and Drinfeld
modular curves are shown to yield asymptotically optimal families. Families arising
from such curves are called modular. For practical purposes, explicit constructions
of asymptotically optimal families are desired. This was first done by Garcia
and Stichtenoth in 1995, giving recursively constructed towers. To date, there are
several known recursively defined asymptotically optimal towers, which are all
proved by Elkies to be modular. Elkies further conjectures that all recursively defined
asymptotically optimal towers over finite fields with square cardinality are
modular.
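The threshold $q \geq 49$ can be checked numerically. The sketch below compares the asymptotic rate guaranteed by an asymptotically optimal family, $R = 1 - \delta - 1/(\sqrt{q} - 1)$, with the asymptotic Gilbert-Varshamov rate $R = 1 - H_q(\delta)$, where $H_q$ is the $q$-ary entropy function. Both formulas are the standard asymptotic forms and are assumptions of this illustration; the function names and the grid search are likewise only illustrative.

```python
from math import log, sqrt


def gv_rate(q, delta):
    """Asymptotic Gilbert-Varshamov rate 1 - H_q(delta), for 0 < delta < 1 - 1/q."""
    h = (delta * log(q - 1) - delta * log(delta)
         - (1 - delta) * log(1 - delta)) / log(q)
    return 1 - h


def ag_rate(q, delta):
    """Rate reached by algebraic geometry codes when N/g tends to sqrt(q) - 1."""
    return 1 - delta - 1 / (sqrt(q) - 1)


def max_gap(q, steps=10000):
    """Largest value of ag_rate - gv_rate over a grid of relative distances."""
    best = float("-inf")
    for i in range(1, steps):
        delta = (1 - 1 / q) * i / steps
        best = max(best, ag_rate(q, delta) - gv_rate(q, delta))
    return best
```

A positive value of `max_gap(q)` means that the algebraic geometry line rises above the Gilbert-Varshamov curve somewhere; among squares this first happens at $q = 49$.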



This research is supported in part by the NSA grant MDA904-03-1-0069. 







In this paper we review the recursive asymptotically optimal towers con- 
structed so far, discuss the reasons behind Elkies’ conjecture, present numerical 
evidence of this conjecture, and sketch Elkies’ proof of modularity of the new 
families. 



2. Algebraic Geometry Codes 



Let $X$ be a smooth projective curve of genus $g$ defined over a finite field $\mathbb{F}$ of
$q$ elements. Choose $n$ distinct $\mathbb{F}$-rational points $P_1, \ldots, P_n$ on $X$ and an effective
divisor $G$ of $X$ with support disjoint from the $P_i$'s. Suppose $n > \deg G > g$. Denote
by $\mathcal{L}(G)$ the finite-dimensional $\mathbb{F}$-vector space spanned by the nonzero $\mathbb{F}$-rational
functions $f$ on $X$ such that $\mathrm{div} f + G \geq 0$. Consider the $\mathbb{F}$-linear map $\varphi$ from $\mathcal{L}(G)$
to $\mathbb{F}^n$ defined by

$$\varphi(f) = (f(P_1), \ldots, f(P_n)).$$

This map is well defined since the poles of $f$ lie in the support of $G$. Further,
the condition $\mathrm{div} f + G \geq 0$ implies that the total number of poles of a nonzero
$f$, counting multiplicities, is at most $\deg G$, hence a nonzero $f$ can have at most
$\deg G$ zeros on $X$. As $n > \deg G$, we see that $\varphi(f) \neq 0$ if $f \neq 0$. In other words, $\varphi$
is an injection. The image of $\varphi$, denoted by $C = C(P_1, \ldots, P_n; G)$, is a linear code
over $\mathbb{F}$ of length $n$, called an algebraic geometry code. Its dimension $k = \dim \mathcal{L}(G)$
is at least $\deg G - g + 1$ by the Riemann-Roch theorem. Its minimal distance
$d$, which is the least number of nonzero components among nonzero codewords
in $C$, is at least $n - \deg G$, as explained above. From a practical point of view, it
would be desirable that $C$ has a large information rate $r(C) := k/n$ so that it can
transmit more messages. On the other hand, one would also desire that $C$ has a
large error-correcting rate $\delta(C) := d/n$ so that it can correct more errors. These
two quantities apparently pull in opposite directions. A code is said to be good if
the sum of these two quantities is large. In our case, we have the following lower
bound for an algebraic geometry code $C$:



$$r(C) + \delta(C) \geq \frac{n - g + 1}{n} > 1 - \frac{1}{n/g}.$$
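The genus zero case makes the bound concrete. In the sketch below the curve is the projective line over the prime field with $p = 7$ elements, the points $P_i$ are the nonzero field elements, and $G$ is taken to be `deg_G` times the point at infinity, so that the functions in question are the polynomials of degree at most `deg_G`. These choices, and the function names, are assumptions made for illustration; the construction is the classical Reed-Solomon code.

```python
from itertools import product


def evaluation_code(p, points, deg_G):
    """Image of the evaluation map on polynomials of degree <= deg_G over F_p."""
    def encode(coeffs):
        return tuple(
            sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
            for x in points
        )
    return {encode(c) for c in product(range(p), repeat=deg_G + 1)}


def min_weight(code):
    """Least number of nonzero components among nonzero codewords."""
    return min(sum(1 for s in w if s) for w in code if any(w))


def check_bound(p=7, deg_G=2):
    """Return (n, k, d) for the genus 0 example, with k read off from |C| = p**k."""
    points = list(range(1, p))            # n = p - 1 rational points
    code = evaluation_code(p, points, deg_G)
    n, k = len(points), deg_G + 1
    assert len(code) == p ** k            # the evaluation map is injective
    return n, k, min_weight(code)
```

With the default parameters, $n = 6$, $k = \deg G - g + 1 = 3$ and $d = n - \deg G = 4$, so $k + d = n - g + 1$ and the lower bound is attained.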



Therefore, to construct good algebraic geometry codes, we seek curves $X$ defined
over $\mathbb{F}$ whose number of $\mathbb{F}$-rational points, $N_q(X)$, divided by its genus $g(X)$ is
large. In fact, we'll need a family of curves $\{X_i\}$ defined over $\mathbb{F}$ such that the ratio
$N_q(X_i)/g(X_i)$ is large as $i$ approaches infinity. For this purpose, let $N_q(g)$ denote
the maximal possible number of $\mathbb{F}$-rational points on a curve of genus $g$ defined
over $\mathbb{F}$, and consider the quantity

$$A(q) = \limsup_{g \to \infty} \frac{N_q(g)}{g}$$

first introduced by Ihara in 1981. He showed that [10]

$$A(q) \geq \sqrt{q} - 1 \quad \text{if } q \text{ is a square.}$$




Then Drinfeld and Vladut [1] in 1983 proved the unconditional upper bound

$$A(q) \leq \sqrt{q} - 1.$$

Combining both, we conclude that

$$A(q) = \sqrt{q} - 1 \quad \text{when } q \text{ is a square.}$$

To date, the precise value of A(q) for a nonsquare q is unknown. However, there 
are various lower bounds for such A(q). The reader is referred to [20], [16], [15], 
[18], and [12] for more information. 

For the remainder of this paper, $q$ is assumed to be a square. A sequence of
curves $\{X_i\}$ defined over $\mathbb{F}$ with genus $g(X_i)$ approaching $\infty$ and $N_q(X_i)/g(X_i)$
approaching $A(q) = \sqrt{q} - 1$ as $i$ increases to infinity is called an asymptotically optimal
family of curves. Ihara obtained the aforementioned lower bound for $A(q)$ by
exhibiting an asymptotically optimal family of Shimura curves. In [19] Tsfasman,
Vladut, and Zink exhibited an asymptotically optimal family of modular curves.
These families are not explicit in the sense that the defining equations of these
curves are not explicit.

Instead of viewing a curve $X$ geometrically, one may regard it algebraically
by studying the associated field of $\mathbb{F}$-rational functions on $X$, called the function
field $\mathbb{F}(X)$ attached to $X$. The $\mathbb{F}$-rational points on $X$ correspond to degree one
places of $\mathbb{F}(X)$, and the genus of $\mathbb{F}(X)$ is equal to the genus of $X$. Conversely, given
a field $F$ over $\mathbb{F}$ with transcendence degree one, there is a smooth projective curve
$X$ defined over $\mathbb{F}$, unique up to isomorphism, such that $F$ is its function field. In
the event that $\{X_i\}$ is a sequence of covers, the associated function fields $\{\mathbb{F}(X_i)\}$
form a tower under inclusion. Given a function field $F$ over $\mathbb{F}$, we can compute the
ratio of the number $N(F)$ of places of degree one versus its genus $g(F)$ and call
a family of function fields $\{F_i\}$ over $\mathbb{F}$ bad, good, or asymptotically optimal if the
limit of $N(F_i)/g(F_i)$, as $i$ approaches infinity, is 0, nonzero, or $A(q)$, respectively.
Most of the families are bad. Asymptotically optimal families are rare. In what
follows, we shall discuss explicit asymptotically optimal towers which are defined
recursively and Elkies' modularity conjecture.



3. Recursively Defined Towers 

By a recursively defined tower over $\mathbb{F}$ we mean a strictly increasing tower $\mathcal{T}$ of
function fields

$$F_1 \subset F_2 \subset F_3 \subset \cdots \tag{3.1}$$

satisfying the following conditions:

1. Each $F_i$ is a function field with field of constants $\mathbb{F}$;

2. $F_{i+1}$ is a finite separable extension of $F_i$ for all $i \geq 1$;

3. The genus $g(F_i)$ of $F_i$ is greater than 1 for some $i$;

4. $F_1$ is the rational function field $\mathbb{F}(x_1)$, $F_{i+1} = F_i(x_{i+1})$ for $i \geq 1$, and there is
a rational function $f(X, Y)$ in variables $X$ and $Y$ with coefficients in $\mathbb{F}$ such
that $f(x_i, x_{i+1}) = 0$ for $i \geq 1$.

Clearly the fields Fi are defined by explicitly given equations. Moreover, the re- 
cursive defining equation facilitates the study of the splitting of degree one places 
in its immediate superfield, which provides a lower bound of the growth of the 
number of the places of degree one, and the study of the ramification in each con- 
secutive extension, which, combined with the Hurwitz genus formula, describes 
the growth of the genus of fields. Recall that the limit as i goes to infinity of the 
quotient of the number N(Fi) of places of degree one by the genus g(Fi) for each 
Fi tells us how good the tower is. 

Remark. For the sake of simplicity, we restrict ourselves to adding one variable 
and satisfying one recursive relation at each step. We shall see later an example of 
adding two variables and satisfying two relations. Obviously it extends to adding 
m variables and satisfying m conditions. 

Exhibited below are a few examples of recursively defined asymptotically 
optimal towers whose field of constants F has square cardinality. 

1. The first such tower was given by Garcia and Stichtenoth [6] in 1995 over $\mathbb{F}_{q^2}$
with the recursive polynomial

$$f(X, Y) = (YX)^q + YX - X^{q+1}.$$

2. In 1996, Garcia and Stichtenoth [7] found a subtower of the first tower, defined
by the recursive relation

$$f(X, Y) = Y^q + Y - \frac{X^q}{X^{q-1} + 1},$$

which is also asymptotically optimal. Elkies in [2] showed that the above two
towers are in fact Drinfeld modular towers, that is, they arise from Drinfeld
modular curves by reduction.

3. In [8] Garcia and Stichtenoth constructed two more towers over $\mathbb{F}_4$ and $\mathbb{F}_9$,
respectively. They were shown by Elkies [3] to come from the reduction mod 2
of the elliptic modular curves $\{X_0(3^n)\}$ and reduction mod 3 of the modular
curves $\{X_0(2^n)\}$, respectively. Sole in [17] gave a slightly different proof of
this fact using the Jacobi quartic identity.

4. In his Allerton conference paper [3] in 1997, Elkies listed six families of elliptic
modular curves $\{X_0(\ell^n)\}$ for $\ell = 2, 3, 4, 5, 6$, and $\{X_0(3 \cdot 2^n)\}$, as well as two
families of Shimura curves, showing that their reductions mod $p$ for primes $p$
not dividing the level yield recursively defined asymptotically optimal towers
over $\mathbb{F}_{p^2}$. We shall explain some of these families in the next section. It should
be pointed out that the first two towers are wild towers, meaning that wild
ramifications occur in the consecutive field extensions, while the towers in [4]
are tame towers, that is, at most tame ramifications occur in consecutive
extensions.




Modularity of Optimal Towers 



55 



5. With the help of computer search, in 2002 Li, Maharaj, and Stichtenoth
[13] gave four new recursively defined asymptotically optimal towers over
$\mathbb{F}_4$, $\mathbb{F}_9$, $\mathbb{F}_{25}$, $\mathbb{F}_{49}$, respectively. These are tame towers, and they are not
subtowers of any previously known asymptotically optimal towers. Elkies showed
that they are elliptic modular towers [4]. The recursive polynomials and the
proofs will be discussed in later sections.



6. The most recent asymptotically optimal tower is the one constructed by Bezerra
and Garcia in 2003. It is a subtower of the second tower with the recursive
rational function

\[ f(X,Y) = \frac{Y-1}{Y^q} - \frac{X^q-1}{X}. \]

The modularity of this tower is unknown.


The tower T can be seen from a geometric point of view as follows. Denote by $X_i$ the
smooth irreducible curve defined over $F$ whose function field is $F_i$. The increasing
chain (3.1) means that geometrically we have a sequence of covering curves:

\[ X_1 \leftarrow X_2 \leftarrow X_3 \leftarrow \cdots \]

with $X_1$ equal to the projective line $\mathbb{P}^1$ over $F$, and $X_2$ a curve in $X_1 \times X_1$
defined by $f(X,Y) = 0$. Inductively, we see that $X_n$ is a curve in the product of $n$
copies of $X_1$, namely, $X_1 \times \cdots \times X_1$, such that a point $(P_1, \dots, P_n)$ of the product
$X_1 \times \cdots \times X_1$ lies in $X_n$ if and only if $(P_j, P_{j+1})$ lies in $X_2$ for $j = 1, \dots, n-1$. In
other words, $X_n$ is obtained by iterating $n-1$ times the correspondence from $X_1$
to $X_1$ given by $X_2$.
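The iteration of the correspondence can be illustrated numerically. The following is a minimal sketch, assuming a prime field Z/p and counting only affine chains $(x_1, \dots, x_n)$ with $f(x_j, x_{j+1}) = 0$; it ignores points at infinity and singularities, so it does not compute the number of rational places of $F_n$. The function name `count_chains` and the toy polynomial in the usage note are hypothetical and are not taken from the text.

```python
def count_chains(f, p, n):
    """Count n-tuples (x_1, ..., x_n) in (Z/p)^n with f(x_j, x_{j+1}) = 0 mod p."""
    # Successors of x under the correspondence defined by f(X, Y) = 0 (affine part only).
    succ = {x: [y for y in range(p) if f(x, y) % p == 0] for x in range(p)}
    # counts[x] = number of chains of the current length that end at x.
    counts = {x: 1 for x in range(p)}
    for _ in range(n - 1):
        nxt = {x: 0 for x in range(p)}
        for x, c in counts.items():
            for y in succ[x]:
                nxt[y] += c
        counts = nxt
    return sum(counts.values())
```

For instance, with the toy relation $Y^2 - X$ over Z/7 every chain is determined by its last coordinate, so `count_chains(lambda x, y: y * y - x, 7, n)` stays equal to 7 for every n.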



4. Elkies’ Conjecture 



Before explaining the underlying philosophy of Elkies’ conjecture, we recall some
basic facts about elliptic modular curves. The group $SL_2(\mathbb{Z})$ acts on the Poincaré
upper half-plane $\mathfrak{H}$ by fractional linear transformations. Given a positive integer
$N$, we are interested in the congruence subgroup

\[ \Gamma_0(N) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z}) : c \equiv 0 \pmod{N} \right\}, \]

and its subgroup

\[ \Gamma_1(N) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z}) : a \equiv d \equiv 1 \pmod{N},\ c \equiv 0 \pmod{N} \right\}. \]

The quotients $Y_0(N) := \Gamma_0(N) \backslash \mathfrak{H}$ and $Y_1(N) := \Gamma_1(N) \backslash \mathfrak{H}$ are called modular
curves; each has finitely many cusps. After adjoining the cusps, we obtain
compactified modular curves $X_0(N)$ and $X_1(N)$, respectively. These are curves defined
over $\mathbb{Q}$. Both $Y_0(N)$ and $Y_1(N)$ are moduli spaces: each parametrizes equivalence
classes of elliptic curves defined over $\mathbb{C}$ with a certain level $N$ structure. We explain
$Y_0(N)$ in more detail.
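The two congruence conditions are easy to test for a concrete integer matrix. The sketch below is only an illustration of the definitions; the function names and the representation of a matrix as a pair of rows `((a, b), (c, d))` are assumptions made here.

```python
def in_gamma0(m, N):
    """True if m = ((a, b), (c, d)) lies in SL_2(Z) and c = 0 mod N."""
    (a, b), (c, d) = m
    return a * d - b * c == 1 and c % N == 0

def in_gamma1(m, N):
    """True if m lies in Gamma_0(N) and a = d = 1 mod N."""
    (a, b), (c, d) = m
    return in_gamma0(m, N) and a % N == 1 % N and d % N == 1 % N
```

For example, the matrix with rows (2, 1) and (5, 3) has determinant 1 and lower-left entry 5, so it lies in $\Gamma_0(5)$, but its diagonal entries are not congruent to 1 mod 5, so it is not in $\Gamma_1(5)$.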







Consider the case $N = \ell^n$ for an integer $\ell > 1$ and integer $n > 0$. A point $z$ in
$Y_0(\ell^n)$ represents the equivalence class of an elliptic curve $E$ together with a cyclic
subgroup $C_{\ell^n}$ of order $\ell^n$. We can also think of $z$ as representing the isogeny from
$E$ to its quotient $E/C_{\ell^n}$, which is again an elliptic curve. Being cyclic, the group
$C_{\ell^n}$ contains a unique cyclic subgroup $C_{\ell^{n-1}}$. After $n-1$ iterations, we obtain a
descending sequence

\[ C_{\ell^n} \supset C_{\ell^{n-1}} \supset \cdots \supset C_{\ell}. \tag{4.1} \]

In terms of isogenies, this yields a chain

\[ E = E_0 \to E_1 \to \cdots \to E_n, \]

where $E_i = E/C_{\ell^i}$ for $i = 1, \dots, n$.

On the other hand, the set of complex points on an elliptic curve over $\mathbb{C}$ may
be identified with $\mathbb{C}$ divided by a rank two lattice. Another way to interpret $z$ is to
regard it as parametrizing the isogeny from the elliptic curve with lattice $\mathbb{Z} + z\mathbb{Z}$
to the elliptic curve with lattice $\ell^{-n}\mathbb{Z} + z\mathbb{Z}$, which is equivalent to the lattice
$\mathbb{Z} + \ell^n z \mathbb{Z}$.

The advantage of viewing a point $z$ as the chain (4.1) is that we can break
it into $n-1$ subchains of length 2 so that each subchain is a point in $Y_0(\ell^2)$, and
consequently we obtain a map

\[ \pi_n : Y_0(\ell^n) \to (Y_0(\ell^2))^{n-1} \]

by sending $E = E_0 \to E_1 \to \cdots \to E_n$ to the point $(E_0 \to E_1 \to E_2,\ E_1 \to
E_2 \to E_3,\ \dots,\ E_{n-2} \to E_{n-1} \to E_n)$. In terms of points $z$ in $\mathfrak{H}$, the map $\pi_n$ sends
$z$ to the point $(z, \ell z, \dots, \ell^{n-2} z)$ in $n-1$ copies of $Y_0(\ell^2)$. We extend $\pi_n$ to a map
from $X_0(\ell^n)$ to $(X_0(\ell^2))^{n-1}$.

The Atkin-Lehner involution $w_{\ell^n}$ acts on $X_0(\ell^n)$ by sending $z$ to $-1/(\ell^n z)$, or
equivalently, it maps $E_0 \to E_1 \to \cdots \to E_n$ to $E_n \to E_{n-1} \to \cdots \to E_0$. Here the
map $E_j \to E_{j-1}$ is the dual isogeny of $E_{j-1} \to E_j$ for $j = 1, \dots, n$. Note that there
are two natural maps from $X_0(\ell^2)$ to $X_0(\ell)$: the first one starts with the involution
$w_{\ell^2}$ on $X_0(\ell^2)$, followed by the natural projection proj from $X_0(\ell^2)$ to $X_0(\ell)$,
while the second one starts with proj from $X_0(\ell^2)$ to $X_0(\ell)$, followed by the
involution $w_\ell$ on $X_0(\ell)$. Comparison of these two maps yields a description of the
image of $\pi_n$. More precisely, Elkies proved

Theorem 4.1. [Elkies [3]] The map

\[ \pi_n : X_0(\ell^n) \to (X_0(\ell^2))^{n-1} \]

given by

\[ z \mapsto (z, \ell z, \dots, \ell^{n-2} z) \]

is an injection. Its image consists of points $(P_1, \dots, P_{n-1})$ in $(X_0(\ell^2))^{n-1}$ satisfying
the relation

\[ \mathrm{proj} \circ w_{\ell^2}(P_j) = w_\ell \circ \mathrm{proj}(P_{j+1}) \quad \text{for } j = 1, \dots, n-2. \tag{4.2} \]







When $X_0(\ell^2)$ has genus zero (and hence so does $X_0(\ell)$), we may parametrize
the points on the curve by its Hauptmodul, that is, a generator of the function field
of the curve. We compute the actions of $w_{\ell^2}$ and $w_\ell$ using the respective
Hauptmoduln, and further express the Hauptmodul of $X_0(\ell)$ in terms of the Hauptmodul
$x_1$ of $X_0(\ell^2)$. In this way we obtain a recursive relation $f(X,Y)$ describing the
relation (4.2) as $f(x_1(z), x_1(\ell z)) = 0$ for all $z \in \mathfrak{H}$. This works for $\ell = 2, 3, 4, 5$.



Example. Consider the case £ = 2. The Hauptmodul for X 0 (4) is 

U ( ~\ i i ^ \8 

h4{z) - 1 + 8 { ^) ] 



and the Hauptmodul for Xo(2) is 

M*> - = 8 

Here 



V( z ) x24 oCM^O + l) 2 






h^(z) — 1 



\[ \eta(z) = e^{2\pi i z/24} \prod_{m \ge 1} \left(1 - e^{2\pi i m z}\right) \]

is a modular function of weight 1/2. The recursive rational function is
f(X,Y) = (X 2 - i)((I±l)2 _ i) _ i. 



A more interesting and complicated case is $\ell = 6$. The modular curve $X_0(36)$
has genus one. It is an elliptic curve with affine equation given by $y^2 = x^3 + 1$,
hence points on $X_0(36)$ may be described by pairs $(x, y)$ satisfying the defining
equation. The curve $X_0(6)$ has genus 0. After going through the computations
outlined above, Elkies [3] showed that the map $\pi_n$ identifies the points in $X_0(6^n)$
with the points $((x_1, y_1), \dots, (x_{n-1}, y_{n-1}))$ in $X_0(36)^{n-1}$ satisfying the conditions

\[ (x_{j-1}^3 - 8)(z_j^3 - 8) = 72 \quad \text{for } j = 2, \dots, n-1, \tag{4.3} \]

where

\[ z_j = \left( \frac{y_j + 3}{x_j - 2} \right)^2 - x_j - 2 \]

is the $x$-coordinate of the point $(2, 3) - (x_j, y_j)$ on $X_0(36)$. In conclusion, the
tower of the fields over $\mathbb{F}_{p^2}$

\[ F_1 \subset F_2 \subset F_3 \subset \cdots \]

obtained from $X_0(6^n)$ modulo a prime $p \ne 2, 3$ is constructed by adjoining two
variables at each stage, namely, $F_j = F_{j-1}(x_j, y_j)$ for $j \ge 2$, which satisfy two
relations

\[ y_j^2 = x_j^3 + 1 \]

and (4.3).
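The expression for $z_j$ is the usual chord formula for the elliptic curve group law. As a sanity check, the sketch below (with the prime p = 13 chosen here purely for illustration) computes, for every affine point $(x, y)$ of $y^2 = x^3 + 1$ over Z/p with $x \ne 2$, the sum of $(2, 3)$ and $(x, -y)$ and confirms that the result, whose first coordinate is $z = ((y+3)/(x-2))^2 - x - 2$, lies on the curve again. The function name is hypothetical.

```python
def check_difference_formula(p):
    """Return (number of points checked, whether every computed point lies on the curve)."""
    checked, ok = 0, True
    for x in range(p):
        for y in range(p):
            on_curve = (y * y - x ** 3 - 1) % p == 0
            if not on_curve or x == 2:
                continue  # skip non-points and the case needing the tangent or infinity
            lam = (y + 3) * pow(x - 2, p - 2, p) % p   # (y + 3) / (x - 2) mod p
            z = (lam * lam - x - 2) % p                # x-coordinate of (2,3) - (x,y)
            # Chord through (2, 3) and (x, -y) has slope -lam; second coordinate of the sum:
            y3 = (-lam * (2 - z) - 3) % p
            ok = ok and (y3 * y3 - z ** 3 - 1) % p == 0
            checked += 1
    return checked, ok
```

Calling `check_difference_formula(13)` exercises all affine points with $x \ne 2$; the excluded points $(2, \pm 3)$ correspond to the tangent case and to the point at infinity.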







Based on the fact that all asymptotically optimal recursive towers known at 
the time were proved by him to arise from either elliptic modular curves, Shimura 
modular curves, or Drinfeld modular curves by reduction, Elkies conjectured in 
1997 that this should be a general phenomenon. 

Elkies’ Modularity Conjecture [3]. Every asymptotically optimal recursively de- 
fined tower over a finite field with square cardinality is modular, that is, the fields 
in the tower are the function fields of either elliptic modular curves, Shimura mod- 
ular curves, or Drinfeld modular curves by reduction. 

After the paper [13], Elkies includes towers arising from compactification by
adding cusps of $\mathfrak{H}/(\Delta \cap \Gamma_0(\ell^n))$, where $\Delta$ is some other congruence subgroup of
$PGL_2(\mathbb{Q})$, modulo primes coprime to $\ell$. On his website [5], Elkies lists 15 (resp. 6)
cases of elliptic modular towers arising from $\Delta \cap \Gamma_0(2)$ (resp. $\Delta \cap \Gamma_0(4)$) with the
recursive relation $f(X,Y)$ a polynomial quadratic in $X$ and $Y$.



5. Numerical Evidence of Elkies’ Conjecture 

Using a computer program called KASH, in a joint work with Maharaj and 
Stichtenoth [13], we performed an extensive search for polynomials f(X,Y) of 
low degree over small finite fields which define good towers in general and asymp- 
totically optimal towers over finite fields of square cardinality in particular. To 
achieve this goal, we considered only towers satisfying the three conditions in the 
following theorem, which provides an explicit lower bound of how good such towers 
are. 



Theorem 5.1. [9] Let $F_1 \subset F_2 \subset \cdots$ be a tower of function fields over $\mathbb{F}_q$ such that

(i) All consecutive extensions $F_{n+1}$ over $F_n$ are tame;

(ii) The set

\[ R = \{\text{places } v \text{ of } F_1 : v \text{ is ramified in } F_n \text{ for some } n \ge 2\} \]

is finite;

(iii) The set

\[ S = \{\text{places } v \text{ of } F_1 : \deg v = 1 \text{ and } v \text{ splits completely in all } F_n\} \]

is nonempty.

Then

\[ A(q) \ge \lim_{n \to \infty} \frac{N(F_n)}{g(F_n)} \ge \frac{2s}{2g(F_1) - 2 + r}, \]

where $s$ is the cardinality of $S$, and $r = \sum_{v \in R} \deg v$.
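The lower bound of Theorem 5.1 is simple to evaluate and to compare with the Drinfeld-Vladut upper bound $\sqrt{q} - 1$. The sketch below uses exact rational arithmetic; the function names are hypothetical, and the value r = 8 in the usage note is an illustrative input chosen here, not a value stated in the text.

```python
from fractions import Fraction
from math import isqrt

def tame_tower_bound(s, g1, r):
    """Lower bound 2s / (2 g(F_1) - 2 + r) from Theorem 5.1."""
    denom = 2 * g1 - 2 + r
    if denom <= 0:
        raise ValueError("the bound requires 2*g1 - 2 + r > 0")
    return Fraction(2 * s, denom)

def drinfeld_vladut(q):
    """Upper bound sqrt(q) - 1, computed exactly when q is a perfect square."""
    root = isqrt(q)
    if root * root != q:
        raise ValueError("this sketch only handles square q")
    return root - 1
```

With $g(F_1) = 0$ and $s = 6$ splitting places over $\mathbb{F}_9$, `tame_tower_bound(6, 0, 8)` equals `drinfeld_vladut(9)`, so under these inputs the lower bound would meet the upper bound exactly when $r = 8$.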



In the course of our search, all known asymptotically optimal recursive towers 
over small finite fields were recovered, but no good towers over a prime field were 
found. Based on this fact, we are tempted to make the following 







Conjecture. No tame towers over prime fields satisfying conditions (i)-(iii) are
recursively defined by a polynomial $f(X,Y)$ of degree 2 or 3.

In addition to the numerical evidence, there is some theoretical support for
this conjecture. Lenstra in [11] proved that the construction of Garcia, Stichtenoth
and Thomas [9] (for every finite field which is not prime) does not work over a prime
field $\mathbb{F}_p$ and $f(X,Y) = Y^2 + aX^2 + bX$ over $\mathbb{F}_p$. In other words, such $f$ confirms
the conjecture above. Moreover, Maharaj, Stichtenoth and Wulftange [14] showed
that the recursive polynomial $f(X,Y) = Y^2 + aX^2 + bX$ over $\mathbb{F}_p$ defines a tower
over $\mathbb{F}_q$ satisfying the conditions (i)-(iii) above if and only if $p = 3$, $a = 1$, $b \ne 0$
and $q$ is a square. Further, $f(X,Y) = Y^3 + aX^3 + bX^2 + cX$ over $\mathbb{F}_p$ defines a
tower satisfying (i)-(iii) if and only if $p = 2$, $a = b = c = 1$ and $q$ is a square. Hence
they provide further support to the above assertion.

Of the conditions (i)-(iii) in Theorem 5.1, our experience indicates that the 
condition (iii) is more restrictive than (ii). 

As a result of our computer search, four new asymptotically optimal towers were
discovered in [13]; they were proved to be modular by Elkies [4] in an appendix
to [13]. This provides numerical evidence for Elkies’ modularity conjecture. We
summarize the main results below.

Theorem 5.2. [Li-Maharaj-Stichtenoth [13] and Elkies [4]]

(1) The polynomials

\[
\begin{aligned}
& X^2Y^3 + (X^3 + X^2 + X)Y^2 + (X + 1)Y + X^3 + X && \text{over } \mathbb{F}_4 \\
& 2XY^2 + (X^2 + X + 1)Y + X^2 + X + 2 && \text{over } \mathbb{F}_9 \\
& (4X + 1)Y^2 + (X^2 + X + 2)Y + X + 3 && \text{over } \mathbb{F}_{25} \\
& (X^2 + 6)Y^2 + XY + X^2 + 4 && \text{over } \mathbb{F}_{49}
\end{aligned}
\]

define recursive asymptotically optimal towers.

(2) These towers are not subtowers of the known asymptotically optimal towers.

(3) These towers are new modular towers. More precisely, the $n$th curve in each
tower is isomorphic with the elliptic modular curve associated with the following
congruence subgroup of $PSL_2(\mathbb{Z})$:

\[
\begin{aligned}
& \Gamma_1(9) \cap \Gamma_0(3^{n+1}) && \text{over } \mathbb{F}_4 \\
& \Gamma_1(5) \cap \Gamma_0(2^{n}) && \text{over } \mathbb{F}_9 \\
& \Gamma_1(12) \cap \Gamma_0(2^{n+1}) && \text{over } \mathbb{F}_{25} \\
& \Gamma_1(5) \cap \Gamma_0(2^{n}) && \text{over } \mathbb{F}_{49}.
\end{aligned}
\]

Several remarks are in order. The new towers, while they are not subtowers,
are supertowers of previously known modular towers $X_0(3^{n+1})$ and $X_0(3 \cdot 2^{n+1})$,
and of the modular tower $X_0(5 \cdot 2^n)$, which can be obtained by known methods.

We note the following new features of the new towers:

(A) Every previous recursive tower of elliptic modular curves is either $\{X_0(\ell^n N_0)\}$
or a subtower of $\{X_0(\ell^n N_0)\}$; new towers require $\Gamma_0(\ell^n N_0) \cap \Gamma_1(N_0)$. Because of
the involvement of $\Gamma_1$ groups, in the proof of modularity of new towers, we cannot
use the usual models of these curves, in which rational functions have rational
Fourier expansions at the cusp at infinity; instead, one has to use Igusa’s model of
the modular curve, which is a twist of the usual one.

(B) In previous modular towers, as shown in the previous section, the method is to
find a modular function $x_1(\cdot)$ on the upper half-plane $\mathfrak{H}$ satisfying $f(x_1(z), x_1(\ell z)) = 0$
for all $z \in \mathfrak{H}$, leading to the parametrization of the point $(x_1, \dots, x_n)$ by modular
functions $(x_1(z), x_1(\ell z), \dots, x_1(\ell^{n-1} z))$. In the new towers, the identity takes the
form $f(x_1(z), e(x_1(\ell z))) = 0$, where $e$ is a fractional linear transformation such
that

\[ f(X,Y) = 0 \text{ if and only if } f(e(X), e(Y)) = 0. \]

Thus a point on a new tower has coordinates

\[ \left(x_1(z),\ e(x_1(\ell z)),\ e^2(x_1(\ell^2 z)),\ \dots,\ e^{n-1}(x_1(\ell^{n-1} z))\right). \]

In each case the cyclic group generated by $e$ gives the action of $\Gamma_0(N_0)/\Gamma_1(N_0)$
(which is isomorphic to the abelian group $(\mathbb{Z}/N_0\mathbb{Z})^\times/\{\pm 1\}$) on the $x_1$-line of
$X_1(N_0)$.

To give a flavor of Elkies’ proof of the modularity of new towers, we demonstrate
the case of the tower T over $\mathbb{F}_9$ defined by the recursive polynomial

\[ f(X,Y) = 2XY^2 + (X^2 + X + 1)Y + X^2 + X + 2. \]

Note that each $F_{n+1}$ is a quadratic extension of $F_n$ for $n \ge 1$. The general strategy
is to simplify the tower by successively dividing out symmetries until the tower
becomes a recognizable modular tower.

The starting point is to find a symmetry on the curve $X_2$ defined by $f(X,Y) = 0$.
We get some clue by looking at the set $S$ of $F$-rational points on the projective
line which split completely in all fields $F_n$ in the tower. To find $S$, search for
a maximal set of places of degree one in $F_1 = F(X)$ which split completely in
$F_2$ such that the occurring degree one places in $F_2$ are the same as those in $F_1$
we started with. This then repeats itself as we go up through all extensions in
the tower. Consequently the starting set is the set $S$ we look for. Denote by $\omega$ a
primitive root of $\mathbb{F}_9$; it satisfies $\omega^2 - \omega - 1 = 0$. By straightforward computations,
we find the following splitting information:



    place in F_1      places in F_2
    0                 ∞, 1
    1                 −1, 1
    ∞                 ∞, −1
    −1                −ω, −ω^3
    −ω                0, −ω^3
    −ω^3              0, −ω

Therefore we obtain

\[ S = \{0,\ 1,\ \infty,\ -1,\ -\omega,\ -\omega^3\}. \]
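The affine rows of the splitting table can be reproduced by brute force. The following is a minimal sketch, assuming the model $\mathbb{F}_9 = \mathbb{F}_3[\omega]/(\omega^2 - \omega - 1)$ with elements stored as pairs `(a, b)` meaning $a + b\omega$; all helper names are hypothetical. The place $\infty$ of $F_1$ is not handled, and a root of $f(x, Y)$ is reported as `"inf"` when the leading coefficient $2X$ vanishes at $x$.

```python
P = 3
ZERO, ONE = (0, 0), (1, 0)
ELEMS = [(a, b) for a in range(P) for b in range(P)]  # a + b*w with w^2 = w + 1

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    a, b = u
    c, d = v
    # (a + b w)(c + d w) = ac + bd + (ad + bc + bd) w, using w^2 = w + 1
    return ((a * c + b * d) % P, (a * d + b * c + b * d) % P)

def const(n):
    return (n % P, 0)

def f(x, y):
    """f(X, Y) = 2XY^2 + (X^2 + X + 1)Y + X^2 + X + 2 over F_9."""
    s = add(mul(x, x), x)  # X^2 + X
    t1 = mul(const(2), mul(x, mul(y, y)))
    t2 = mul(add(s, ONE), y)
    return add(add(t1, t2), add(s, const(2)))

def places_above(x):
    """Roots of f(x, Y) in F_9, plus "inf" if the Y^2 coefficient 2x vanishes."""
    roots = [y for y in ELEMS if f(x, y) == ZERO]
    if mul(const(2), x) == ZERO:
        roots.append("inf")
    return roots

NEG_W = (0, 2)    # -w
NEG_W3 = (2, 1)   # -w^3 = w + 2
```

For example, `places_above(NEG_W)` returns the pair `(0, 0)` and `(2, 1)`, that is, the places 0 and $-\omega^3$ of the fifth row.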







To $S$ we attach a directed graph, called the graph of splitting points, with
vertex set $S$ and edge set given by the table above, that is, there is an out-edge
from vertex $u$ to vertices $v$ and $v'$ if and only if the place $u$ of $F_1$ splits into places
$v$ and $v'$ in $F_2$. Each vertex has two out-edges and two in-edges. Note that there
is a loop at the vertex 1 and at the vertex $\infty$; a loop counts as an in-edge
and an out-edge. This graph helps us visualize the following symmetry on $S$:

\[ 0 \leftrightarrow 0, \quad -1 \leftrightarrow -1, \quad 1 \leftrightarrow \infty, \quad -\omega \leftrightarrow -\omega^3. \]

The fractional linear transformation

\[ e(X) = \frac{X}{X - 1} \]

has order two and maps the symmetrical points to each other. Since an $F$-rational
involution on $X_2$ must preserve the symmetry on $S$, this suggests that $e$ is the
desired involution on $X_2$, and inductively on all $X_n$. Indeed this can be verified
by checking

\[ f(X,Y) = 0 \text{ if and only if } f(e(X), e(Y)) = 0. \]



Setting $U = X + e(X)$ and $V = Y + e(Y)$, we obtain a quotient tower Q with
recursive defining polynomial

\[ g(U,V) = UV^2 - U^2V + (U + 1)^2. \]
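Both the invariance of $f$ under $e$ and the passage to $g$ can be checked exhaustively on the affine $\mathbb{F}_9$-points. The sketch below again assumes the model $\mathbb{F}_9 = \mathbb{F}_3[\omega]/(\omega^2 - \omega - 1)$ with elements stored as pairs `(a, b)`; the helper names are hypothetical, and the points with $X = 1$ or $Y = 1$ (which $e$ sends to infinity) are simply skipped.

```python
P = 3
ZERO, ONE = (0, 0), (1, 0)
ELEMS = [(a, b) for a in range(P) for b in range(P)]  # a + b*w with w^2 = w + 1

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def neg(u):
    return ((-u[0]) % P, (-u[1]) % P)

def mul(u, v):
    a, b = u
    c, d = v
    return ((a * c + b * d) % P, (a * d + b * c + b * d) % P)

def inv(u):
    r = ONE
    for _ in range(7):       # u^7 = u^(-1) because u^8 = 1 for nonzero u in F_9
        r = mul(r, u)
    return r

def f(x, y):
    s = add(mul(x, x), x)
    t1 = mul((2, 0), mul(x, mul(y, y)))
    return add(add(t1, mul(add(s, ONE), y)), add(s, (2, 0)))

def e(x):
    return mul(x, inv(add(x, neg(ONE))))   # X / (X - 1), defined for x != 1

def g(u, v):
    t = add(u, ONE)
    return add(add(mul(u, mul(v, v)), neg(mul(mul(u, u), v))), mul(t, t))

AFFINE = [x for x in ELEMS if x != ONE]

def e_preserves_curve():
    return all((f(x, y) == ZERO) == (f(e(x), e(y)) == ZERO)
               for x in AFFINE for y in AFFINE)

def quotient_lands_on_g():
    return all(g(add(x, e(x)), add(y, e(y))) == ZERO
               for x in AFFINE for y in AFFINE if f(x, y) == ZERO)
```

Both functions return `True`. This is consistent with the polynomial identity $(X-1)^2 (Y-1)^2 f(e(X), e(Y)) = f(X,Y)$, which holds in characteristic 3.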

Proceed as before. To find a symmetry for the tower Q, we have to figure out its
graph of splitting points, which arises from that of tower T under $U = X + e(X)$.
It has vertices $0, \infty, 1, -1$ and out-edges $1 \to -1$, $-1 \to -1$, $-1 \to 0$, $0 \to \infty$, $\infty \to \infty$,
$\infty \to 1$. Observe the symmetry

\[ 0 \leftrightarrow 1, \quad \infty \leftrightarrow -1, \]

which suggests the involution $\mu(U) = \frac{1 - U}{1 + U}$. After verifying

\[ g(U,V) = 0 \text{ if and only if } g(\mu(U), \mu(V)) = 0, \]

we conclude that $\mu$ is the desired involution on the Q tower. Under $W = U + \mu(U)$
and $Z = V + \mu(V)$, we obtain a quotient tower H defined by the recursive
polynomial

\[ h(W,Z) = (W - 1)Z^2 + (W - W^2)Z + W^2 + W. \]
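The verification of the symmetry of $g$ can be written out explicitly. The following derivation is a sketch that assumes $\mu(U) = (1-U)/(1+U)$, the fractional linear transformation determined by $0 \leftrightarrow 1$ and $\infty \leftrightarrow -1$; it clears denominators over the integers and then reduces mod 3.

```latex
\begin{aligned}
(1+U)^2 (1+V)^2 \, g(\mu(U), \mu(V))
  &= (1-U^2)(1-V)^2 - (1-U)^2 (1-V^2) + 4 (1+V)^2 \\
  &= -2UV^2 + 2U^2 V - 2U^2 + 2U + 4 + 6V + 6V^2 \\
  &\equiv UV^2 - U^2 V + U^2 + 2U + 1 \pmod{3} \\
  &= UV^2 - U^2 V + (U+1)^2 = g(U,V).
\end{aligned}
```

Hence, in characteristic 3, $g(\mu(U), \mu(V))$ and $g(U,V)$ differ by the factor $(1+U)^2 (1+V)^2$, which gives the stated equivalence away from $U = -1$ and $V = -1$.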

So far, we have constructed three towers: tower T is a two-fold cover of tower
Q, which is a two-fold cover of tower H. We proceed to draw connections with
modular towers from the bottom up.







Theorem 5.3. 

(i) The H tower is isomorphic to the tower from the family of modular curves
$\{X_0(5 \cdot 2^n)/w_5\}$. Here $w_5$ is the Atkin-Lehner involution at 5.

(ii) The Q tower is isomorphic to the tower from the family of modular curves
$\{X_0(5 \cdot 2^n)\}$ with isomorphism given by $(U,V) \mapsto (\alpha(U), \alpha(V))$ where

<*(u) = (j -\)u - r and /2 = _1 in F9- 

(iii) The T tower is isomorphic to the tower from the family of curves
$\{X_0(5 \cdot 2^n) \times_{X_0(5)} X_1(5)\}$ with the isomorphism given by $(X,Y) \mapsto (\beta(X), \beta(Y))$
where

p( x ) = m = p(-i) = /, 

and $a$, $b$, being roots of $x^2 - (I + 1)x + I + 1 = 0$, lie in a quadratic extension
of $\mathbb{F}_9$.



Notice that the fiber product in case (iii) is nothing but the curve of the
group $\Gamma_0(5 \cdot 2^n) \cap \Gamma_1(5) = \Gamma_1(5) \cap \Gamma_0(2^n)$, as stated in Theorem 5.2. In each case
the modular tower consists of the function fields over $\mathbb{F}_9$ of the reduction mod 3
of the corresponding modular curves.

It should be pointed out that the isomorphisms of the first two cases are
over $\mathbb{F}_9$, while the last isomorphism is over a quadratic extension of $\mathbb{F}_9$ if the
usual model of the modular curves is used. However, if the Igusa model, which is a
quadratic twist of the usual model, is used for the modular curves in case (iii),
then the isomorphism in (iii) is again over $\mathbb{F}_9$.

We sketch the proof of Theorem 5.3. Start with the modular curve $X_0(10)$,
which has genus zero and Hauptmodul

\[ G(z) = \frac{\eta(2z)}{\eta(z)} \left( \frac{\eta(5z)}{\eta(10z)} \right)^5. \]

Its quotient $X_0(10)/w_5$ has genus zero and Hauptmodul

\[ H(z) = \left( \frac{\eta(z)\,\eta(5z)}{\eta(2z)\,\eta(10z)} \right)^4 = \frac{G^2 - 4G}{G + 1}. \]

Write $G_i(z)$ for $G(2^i z)$ and $H_i(z)$ for $H(2^i z)$ for brevity. It is not hard to check
that the $G_i$'s and $H_i$'s satisfy the following recursive relations. In other
words, the modular towers in (ii) and (i) are both recursive towers with recursive
relations

\[ G_{i+1}^2 = G_i (G_i G_{i+1} - 2G_{i+1} - 4), \]

\[ H_{i+1}^2 = H_i (H_i H_{i+1} + 8H_{i+1} + 16), \]

respectively. Compare the H tower with the tower from $\{X_0(5 \cdot 2^n)/w_5\}$. Observe
that an isomorphism of recursive towers should preserve fixed points of the
recursive relations. This would give us a clue about the isomorphism. Indeed, solving






$h(W,W) = 0$ against $H_i = H_{i+1}$, one is led to the isomorphism

W Z 

A direct computation shows that this map brings the recursive relation on $H_i, H_{i+1}$
to the recursive relation on $W, Z$. This proves (i).
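The relations among $G$, $H$ and their translates by $z \mapsto 2z$ can be checked on truncated $q$-expansions, $q = e^{2\pi i z}$. The sketch below works with the power series $g = qG$ and $h = qH$, so that each identity becomes an identity of integer power series modulo $q^N$. The identities are assumed in the form $G_{i+1}^2 = G_i(G_iG_{i+1} - 2G_{i+1} - 4)$, $H_{i+1}^2 = H_i(H_iH_{i+1} + 8H_{i+1} + 16)$ and $H(G+1) = G^2 - 4G$. All function names are hypothetical, and a finite truncation is a consistency check, not a proof.

```python
def eta_quotient(exps, N):
    """Coefficients mod q^N of prod_d prod_{n>=1} (1 - q^(d n))^exps[d] (leading q-power dropped)."""
    s = [1] + [0] * (N - 1)
    for d, r in exps.items():
        for k in range(d, N, d):
            for _ in range(abs(r)):
                if r > 0:   # multiply by (1 - q^k)
                    for i in range(N - 1, k - 1, -1):
                        s[i] -= s[i - k]
                else:       # divide by (1 - q^k)
                    for i in range(k, N):
                        s[i] += s[i - k]
    return s

def mul(a, b):
    N = len(a)
    out = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j in range(N - i):
                out[i + j] += ai * b[j]
    return out

def shift(a, m):            # multiply by q^m
    return [0] * m + a[:len(a) - m]

def dilate(a):              # substitute q -> q^2
    out = [0] * len(a)
    for i in range((len(a) + 1) // 2):
        out[2 * i] = a[i]
    return out

def combo(*terms):          # integer linear combination of series
    return [sum(c * s[i] for c, s in terms) for i in range(len(terms[0][1]))]

def check_relations(N=40):
    g1 = eta_quotient({2: 1, 5: 5, 1: -1, 10: -5}, N)   # q * G(z)
    h1 = eta_quotient({1: 4, 5: 4, 2: -4, 10: -4}, N)   # q * H(z)
    g2, h2 = dilate(g1), dilate(h1)                     # q^2 * G(2z), q^2 * H(2z)
    rel_g = mul(g2, g2) == combo((1, mul(mul(g1, g1), g2)),
                                 (-2, shift(mul(g1, g2), 1)), (-4, shift(g1, 3)))
    rel_h = mul(h2, h2) == combo((1, mul(mul(h1, h1), h2)),
                                 (8, shift(mul(h1, h2), 1)), (16, shift(h1, 3)))
    rel_gh = mul(h1, combo((1, g1), (1, shift([1] + [0] * (N - 1), 1)))) == \
        combo((1, mul(g1, g1)), (-4, shift(g1, 1)))
    return rel_g, rel_h, rel_gh
```

Calling `check_relations()` compares all three identities up to $q^{40}$.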

For the Q tower, $G_i = G_{i+1}$ has 4 simple roots, while $g(U,U) = 0$ has two double
roots at $U = -1$ and $U = \infty$. Use the equivalent form of the Q tower by applying
the involution $\mu$ to only one variable. This yields an isomorphic tower Q′ with new
recursive relation

\[ g'(U,V) = (1 - U^2)V^2 - (U^2 + U + 1)V + 1. \]

The Q′ tower can now be identified with the $\{X_0(5 \cdot 2^n)\}$ tower by taking $(G_i, G_{i+1})
= (\alpha(U), \alpha(V))$ with

(U) = jj U where 1 2 = -1 in F 9 . 



This proves (ii). 

Finally we prove (iii). The bottom curve of $\{X_0(5 \cdot 2^n) \times_{X_0(5)} X_1(5)\}$ is $X_1(10)$,
which has genus zero with Hauptmodul given by

\[ G'(z) = e^{-2\pi i z} \prod_{n=1}^{\infty} \left(1 - e^{2\pi i n z}\right)^{c_n}, \]

where

\[ c_n = \begin{cases} -1 & \text{if } n \equiv \pm 1, \pm 2 \pmod{10}, \\ \phantom{-}1 & \text{if } n \equiv \pm 3, \pm 4 \pmod{10}, \\ \phantom{-}0 & \text{if } 5 \mid n. \end{cases} \]

Further, $G = G' - \frac{1}{G'}$, or equivalently, $G'^2 - GG' - 1 = 0$. This implies that
the double cover $X_1(10)$ over $X_0(10)$ is ramified at $G^2 + 4 = 0$. Reducing mod 3
and regarding the reduced curves as curves over $\mathbb{F}_9$, the two ramified points are at $G = I$
and $G = -I$ in $\mathbb{F}_9$. Notice that $\alpha(0) = I$, $\alpha(1) = -I$, and $U = 0$ and $U = 1$
are the two branch points of the double cover of the $U$-line by the $X$-line given
by $U = X + e(X)$. One checks that the isomorphism $(X,Y) \mapsto (\beta(X), \beta(Y))$ as
described in (iii) lifts the isomorphism $(U,V) \mapsto (\alpha(U), \alpha(V))$ given in (ii). This
proves (iii).
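The quadratic relation between $G'$ and $G$ can likewise be tested on truncated $q$-expansions. The sketch below assumes the exponents $c_n$ as described above and the eta-quotient $G(z) = \eta(2z)\eta(5z)^5/(\eta(z)\eta(10z)^5)$; it works with the power series $qG'$ and $qG$, for which the relation $G'^2 - GG' - 1 = 0$ becomes $(qG')^2 - (qG)(qG') - q^2 = 0$. Function names are hypothetical.

```python
def product_series(exponent, N):
    """Coefficients mod q^N of prod_{n>=1} (1 - q^n)^exponent(n)."""
    s = [1] + [0] * (N - 1)
    for n in range(1, N):
        r = exponent(n)
        for _ in range(abs(r)):
            if r > 0:   # multiply by (1 - q^n)
                for i in range(N - 1, n - 1, -1):
                    s[i] -= s[i - n]
            else:       # divide by (1 - q^n)
                for i in range(n, N):
                    s[i] += s[i - n]
    return s

def c(n):
    """Exponent c_n of the Hauptmodul G' of X_1(10)."""
    if n % 5 == 0:
        return 0
    return -1 if n % 10 in (1, 2, 8, 9) else 1

def a_G(n):
    """Exponent of (1 - q^n) in q*G for G = eta(2z) eta(5z)^5 / (eta(z) eta(10z)^5)."""
    return -1 + (n % 2 == 0) + 5 * (n % 5 == 0) - 5 * (n % 10 == 0)

def mul(a, b):
    N = len(a)
    out = [0] * N
    for i, ai in enumerate(a):
        if ai:
            for j in range(N - i):
                out[i + j] += ai * b[j]
    return out

def check_quadratic_relation(N=40):
    gp = product_series(c, N)      # q * G'(z)
    g = product_series(a_G, N)     # q * G(z)
    q2 = [0, 0, 1] + [0] * (N - 3)
    lhs = [x - y - z for x, y, z in zip(mul(gp, gp), mul(g, gp), q2)]
    return all(v == 0 for v in lhs)
```

The discriminant of $T^2 - GT - 1$ in $T$ is $G^2 + 4$, which is where the ramification locus quoted in the text comes from.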



Acknowledgment 

The research of this work is supported in part by the NSA grant MDA904-03-1-0069.







References 

[1] V.G. Drinfel’d and S.G. Vladut, Number of points of an algebraic curve. Funct. Anal. 
17 (1983), 53-54. 

[2] N.D. Elkies, Explicit towers of Drinfeld modular curves. Proceedings of the 3rd Eu- 
ropean Congress of Mathematics, Barcelona, 7/2000. 

[3] N.D. Elkies, Explicit modular towers , Proceedings of the Thirty-Fifth Annual Aller- 
ton Conference on Communication, Control and Computing, T. Basar and A. Vardy, 
eds. (1997), 23-32. 

[4] N.D. Elkies, Appendix to New optimal tame towers of function fields over small finite
fields by W.-C.W. Li, H. Maharaj, and H. Stichtenoth, Lecture Notes in Computer
Science 2369, C. Fieker and D.R. Kohel, eds. (2002), Springer-Verlag, Berlin, 384-389.

[5] N.D. Elkies, http://abel.math.harvard.edu/ elkies/compnt.html. 

[6] A. Garcia and H. Stichtenoth, A tower of Artin-Schreier extensions of function fields
attaining the Drinfeld-Vladut bound, Invent. Math. 121 (1995), 211-222.

[7] A. Garcia and H. Stichtenoth, On the asymptotic behaviour of some towers of func- 
tion fields over finite fields, J. Number Theory 61 (1996), 248-273. 

[8] A. Garcia and H. Stichtenoth, Asymptotically good towers of function fields over 
finite fields, C.R. Acad. Sci. Paris Ser. I Math. 322 (1996), 1067-1070. 

[9] A. Garcia, H. Stichtenoth and M. Thomas, On towers and composita of towers of 
function fields over finite fields. Finite Fields Appl. 3 (1997), no. 3, 257-274. 

[10] Y. Ihara, Some remarks on the number of rational points of algebraic curves over 
finite fields, J. Fac. Sci. Univ. Tokyo Sect. IA Math. 28 (1981), 721-724. 

[11] H.W. Lenstra, Jr., On a Problem of Garcia, Stichtenoth, and Thomas, Finite Fields 
Appl. 8, 1-5 (2001). 

[12] W.-C.W. Li and H. Maharaj, Coverings of curves with asymptotically many rational 
points, J. Number Theory 96 (2002), 232-256. 

[13] W.-C.W. Li, H. Maharaj, and H. Stichtenoth, New optimal tame towers of function
fields over small finite fields, Lecture Notes in Computer Science 2369, C. Fieker and
D.R. Kohel, eds. (2002), Springer-Verlag, Berlin, 372-389.

[14] H. Maharaj, H. Stichtenoth, and J. Wulftange, On a problem of Garcia, Stichtenoth, 
and Thomas II, 2003, preprint. 

[15] H. Niederreiter and C.P. Xing, Towers of global function fields with asymptotically
many rational places and an improvement on the Gilbert-Varshamov bound, Math.
Nachr. 195 (1998), 171-186.

[16] J.-P. Serre, Rational Points on Curves over Finite Fields, Lecture Notes, Harvard 
University, 1985. 

[17] P. Sole, Towers of function fields and iterated means , IEEE Trans. Inform. Theory 
46 (2000), 1532-1535. 

[18] A. Temkine, Hilbert class field towers of function fields over finite fields and lower 
bounds for A(q), J. Number Theory 87 (2001), 189-210. 

[19] M.A. Tsfasman, S.G. Vladut and T. Zink, Modular curves, Shimura curves and
Goppa codes better than the Varshamov-Gilbert bound, Math. Nachr. 109 (1982),
21-28.







[20] T. Zink, Degeneration of Shimura surfaces and a problem in coding theory, in
Fundamentals of Computation Theory, L. Budach (ed.), Lecture Notes in Computer
Science, Vol. 199, Springer, Berlin, p. 503-511, 1985.



Wen-Ching Winnie Li 
Department of Mathematics 
Pennsylvania State University 
University Park, PA 16802, USA 
e-mail: wli@math.psu.edu 




Progress in Computer Science and Applied Logic, Vol. 23, 67-83 
© 2004 Birkhauser Verlag Basel/Switzerland 



A New Correlation Attack on LFSR Sequences 
with High Error Tolerance 

Peizhong Lu and Lianzhen Huang 



Abstract. Let $u = (u_1, u_2, \dots, u_N)$ be $N$ bits of a linear feedback shift
register (LFSR) sequence with $L$ the degree of the feedback polynomial. Let
$z = (z_1, z_2, \dots, z_N)$ be $N$ bits of observed sequence such that $P(z_i = u_i) =
1/2 + \delta$ where $0 < \delta < 1/2$. This paper presents a new efficient correlation
attack on stream ciphers, which is equivalent to solving the problem of recovering
the LFSR’s initial state $(u_1, u_2, \dots, u_L)$ from the observed output sequence
$z$. We consider the problem as a decoding problem for a linear $[N, L]$ code.
Our new approach has at least three advantages. Firstly, the new algorithm
constructs many more independent parity check equations, which results in a
significant decrease both of the decoding errors and of the required length
$N$ of the observed sequence. Secondly, by combining a statistical test
with repeated use of the One-Step decoding algorithm, our novel scheme
provides better performance and lower complexity than other reported methods.
Thirdly, we find a new formula to describe the relationship between the
tendency of attack performance, the weight $w$ of parity check equations, the noise
level $\delta$, and $N$.

Mathematics Subject Classification (2000). Primary 94A55; Secondary 94A60.
Keywords. Stream cipher, correlation attack, statistical test. 



1. Introduction 

In the design of stream cipher systems, the initial states of some linear feedback
shift registers (LFSR) are commonly used as secret keys. The running keystreams
are generated by some nonlinear combination of several LFSR sequences.

There are several classes of attacks against binary stream ciphers. One im- 
portant class of attacks on LFSR-based stream ciphers is fast correlation attacks 
[1, 2, 3, 4, 5, 6, 7]. Siegenthaler [8] showed that it can happen that the observed 
output sequence is correlated to the output of a particular target LFSR. Thus it is 
reasonable to try to apply a so-called divide-and-conquer attack, i.e., try to restore 
the initial state of the target LFSR independently of the other unknown key bits. 







The basic idea of all reported fast correlation attacks is to consider the
cryptographic problem as a suitable decoding one, namely one may consider the output
of the target LFSR to have passed through an observation channel. The channel
is modelled by the Binary Symmetric Channel (BSC), with some error probability
$p = \frac{1}{2} - \delta$, for $\delta > 0$.

Let $u = (u_1, u_2, \dots, u_N)$ be $N$ bits of a linear feedback shift register (LFSR)
sequence with $L$ the degree of the feedback polynomial $f(x)$. Then $u$ is considered
as a codeword of a binary linear $[N, L]$ block code. Let $z = (z_1, z_2, \dots, z_N)$ be $N$
bits of observed sequence such that $P(z_i = u_i) = \frac{1}{2} + \delta$ where $0 < \delta < \frac{1}{2}$.

The correlation attack is a decoding problem of restoring the LFSR’s initial
state $(u_1, u_2, \dots, u_L)$ from the observed output sequence $z$.

Meier and Staffelbach [7] find a very efficient way of iteratively decoding the
$[N, L]$ code when the feedback polynomial $f(x)$ has low weight.

Methods for fast correlation attacks for general feedback polynomials have 
been proposed [5]. Johansson and Jonsson [5] [6] suggest a new fast correlation 
attack based on convolutional codes. They can be applied to arbitrary LFSR feed- 
back polynomials. The Viterbi algorithm with memory orders B < 18 is used as 
the final decoding method. The performance of the algorithm is good. But the 
degree of the feedback polynomial should be less than 64 because of the limit of 
B < 18 in Viterbi algorithm. 

Recently, there are some nice algorithms [1] [3] for fast correlation attacks 
based on linear binary block codes, which can be applied to arbitrary LFSR feed- 
back polynomials. 

Mihaljevic, Fossorier and Imai [3] present two algorithms for the fast
correlation attacks. These decoding procedures offer good trade-offs between the required
sample length, overall complexity and performance. Chepyzhov, Johansson and
Smeets [1] present a new simple algorithm for fast correlation attacks on stream
ciphers. They associate with the target LFSR another binary linear $[n_2, k]$-code
with $k < L$. The $k$ information symbols of this code may coincide with the first $k$
symbols of the initial state of the LFSR we want to recover. The codeword of this
second code is considered to have passed through another channel BSC$_2$ with a doubled
“noise level” $p_2 = 2p(1 - p) > p$. If the length of the new code can be chosen at
least $n_2 = \lceil k/C(p_2) \rceil$, then the decoding of this code leads to the recovery of the
first $k$ symbols in the initial state of the LFSR. Since the new code has dimension
$k$, the decoding complexity is decreased from $O(2^L \times L/C(p))$ to $O(2^k \times k/C(p_2))$,
where $C(p) = 1 - H(p) = 1 - (-p \log_2 p - (1 - p) \log_2 (1 - p))$ is the channel capacity
of the BSC.
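The quantities in this paragraph are simple to evaluate numerically. The sketch below computes the BSC capacity $C(p)$, the doubled noise level $p_2 = 2p(1-p)$ and the code length $n_2 = \lceil k/C(p_2) \rceil$. The function names are hypothetical, and the parameter values in the usage note (p = 0.4, k = 20) are illustrative choices rather than figures from the paper.

```python
from math import ceil, log2

def entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p in (0, 1):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def capacity(p):
    """Capacity C(p) = 1 - H(p) of a BSC with crossover probability p."""
    return 1.0 - entropy(p)

def doubled_noise(p):
    """Noise level p_2 = 2p(1 - p) of the second channel."""
    return 2 * p * (1 - p)

def min_length(k, p2):
    """Smallest length n_2 = ceil(k / C(p_2)) for the [n_2, k] code."""
    return ceil(k / capacity(p2))
```

For p = 0.4 the second channel has `doubled_noise(0.4)` close to 0.48, whose capacity is roughly a thousandth of a bit per symbol, so `min_length(20, 0.48)` is already in the tens of thousands. This illustrates the trade-off between the smaller dimension $k$ and the longer required sequence.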

In this paper, we present two new algorithms, Algorithm A and B, for fast 
correlation attacks. The two algorithms do not depend on the weight of the LFSR 
feedback polynomial. Although we are influenced by [1] and [3], our algorithms 
improve the construction of parity check sets such that the number of parity check 
equations we construct is L — B times more than that in [1] and [3], which results 
in significant decrease both of the decoding errors and of the number of bits of the 
received degraded LFSR sequence. 







We first define some random variables on the number of passed-parity-check 
equations. Then we propose a statistical test based on linear block codes which is 
a main step of our decoding algorithm. Our novel algorithm provides a remark- 
ably better performance and lower complexity than other reported methods by 
repeatedly using One-Step decoding algorithm. 

Our new approach is compared with recently proposed improved fast corre- 
lation attacks in [1] and [3] based on binary linear block codes. Plentiful experi- 
mental results show that our new algorithm yields better performance and lower 
complexity than the best algorithm reported up-to-now. 

Some new interesting theoretical results are also derived in this paper. We
find a new formula to describe the relationship between the tendency of attack
performance, the weight $w$ of parity check equations, the noise level $p$, and the
required length $N$ of the observed sequence, namely the performance of our
correlation attack by using $(w+1)$-weight parity check equations is better than the
one by only using $w$-weight equations if and only if

\[ (1 - 2p) \sqrt{\frac{N}{w + 1}} > 1. \]

The paper is organized as follows. Section 2 introduces some concepts used in
correlation attacks. Section 3 defines $L - B + 1$ sets of parity check equations. Section
4 discusses some random variables of the number of passed-parity-check equations
and their probability distributions. Section 5 presents our new fast correlation
attacks. Comparisons between the recently reported best fast correlation attacks
and our proposed algorithms are given in Section 6. Finally, the results of this
paper are summarized in Section 7.



2. Concepts and Problem Descriptions 

Let $z = (z_1, z_2, \dots, z_N)$ be the observed keystream sequence which is regarded
as the received channel output. Let $u = (u_1, u_2, \dots, u_N)$ be the LFSR sequence
which is considered as a codeword from an $[N, L]$ linear block code $C$. The code $C$
is composed of all the $2^L$ sequences generated by an LFSR with a feedback
polynomial of degree $L$. Due to the correlation between $u_i$ and $z_i$, we can consider each
$z_i$ as the output of the binary symmetric channel, BSC, when $u_i$ was transmitted.
The correlation between $u_i$ and $z_i$ is described by the following probability:

\[ P(z_i = u_i) = 1 - p = 1/2 + \varepsilon \]

where $p < 0.5$ and $\varepsilon > 0$.

The so-called fast correlation attack on a particular LFSR is to find the
initial state $(u_1, u_2, \dots, u_L)$ of the LFSR sequence $u$ by using $z$ and the correlation
probability $P(z_i = u_i) = 1 - p$ with complexity of order $O(2^{\alpha L})$ with respect to
some $\alpha < 1$. Thus the problem of finding a fast correlation attack is equivalent to
the problem of finding a fast decoding algorithm of the linear $[N, L]$ block code







$C$ over a BSC with crossover probability $p$, where $(u_1, u_2, \dots, u_L)$ is called the
information word and $u_{L+1}, \dots, u_N$ are the parity check symbols.
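The channel model can be simulated directly. The following is a minimal sketch, assuming a toy LFSR of length L = 4 with recurrence $u_{t+4} = u_{t+1} \oplus u_t$ (feedback polynomial $x^4 + x + 1$), which is chosen here only for illustration and is not an example from the paper; the function names and the `taps` convention are likewise hypothetical.

```python
import random

def lfsr(state, taps, n):
    """First n bits of the LFSR sequence with the given initial state.

    taps lists offsets k such that u[t] = XOR of u[t - L + k] over k in taps.
    """
    out = list(state)
    L = len(state)
    while len(out) < n:
        out.append(sum(out[-L + k] for k in taps) % 2)
    return out[:n]

def bsc(bits, p, rng):
    """Flip each bit independently with probability p (binary symmetric channel)."""
    return [b ^ (rng.random() < p) for b in bits]

def agreement(u, z):
    """Fraction of positions where z agrees with u; close to 1 - p for long sequences."""
    return sum(a == b for a, b in zip(u, z)) / len(u)
```

For instance, `z = bsc(lfsr([1, 0, 0, 0], (0, 1), 15000), 0.4, random.Random(1))` yields an observed sequence whose agreement with the LFSR output is typically near 0.6, the regime $p$ close to 1/2 that is characteristic of correlation attacks.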

It is worth noting that in the theory of correlation attacks, the typical
values of $p$ are close to 1/2, for example, $p = 0.4$. However, in the theory of
error-correcting codes, the typical values of $p$ are much smaller, for example, $p = 0.05$.

From coding theory, we know that, to realize unique decoding for a codeword
passing through a BSC, the length $N$ of the codeword must be not less than $N_0 = \frac{L}{C(p)}$.
Usually, fast correlation attacks perform better when $N \gg N_0$.

Like other algorithms reported for fast attacks, our algorithms have an off-line
precomputing procedure for constructing independent parity check equations.
When we need to find the initial state of the original LFSR sequence $u$ after
we have received the keystream sequence $z$, our new algorithm decodes $z$ on-line
according to the parity check equations stored on disk.

3. Sets of Parity Check Equations 

We define two types of sets of parity check equations. The first type has $L - B$ sets $\Omega_i$, $i = B+1, \ldots, L$, which correspond to the $i$th information symbol. The second type has one set $\Omega^*$.

Let $G_{LFSR} = (g_1\ g_2\ \cdots\ g_N)$ be the generating matrix of the $[N, L]$ linear code $C$, where $g_i$ is an $L$-dimensional column vector. Let $u = (u_1, u_2, \ldots, u_N)$ be a codeword of $C$. We can see that

$$u_i = U_0 g_i, \quad i = 1, 2, \ldots, N, \tag{3.1}$$

where $U_0$ is the initial state of the LFSR for the sequence. Let $z = (z_1, z_2, \ldots, z_N)$ be $N$ bits of the observed sequence such that $P(z_i = u_i) = \frac{1}{2} + \delta = 1 - p$, where $0 < \delta < \frac{1}{2}$. We have the following parity-check equations corresponding to (3.1):

$$z_i \oplus Z_0 g_i, \quad i = 1, 2, \ldots, N, \tag{3.2}$$

where $Z_0 = (z_1, z_2, \ldots, z_L)$ and $\oplus$ is the sum mod 2. If $z_i \oplus Z_0 g_i = 0$ for some $i$, we call it a passed-parity-check equation.
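As a small illustration of (3.1), the sketch below builds the columns $g_i$ for a toy LFSR and checks that $u_i = U_0 g_i$ for every position. The tap convention (the recurrence $u_n = \bigoplus_{k \in \text{taps}} u_{n-k}$), the bit-mask representation of vectors, and the toy parameters are assumptions made for this example only.

```python
def generator_columns(taps, L, N):
    """Columns g_1..g_N of G_LFSR as L-bit integers (bit k-1 holds coordinate k).

    Assumes the recurrence u_n = XOR of u_{n-k} for k in taps (n > L).
    """
    cols = [1 << k for k in range(L)]      # g_1..g_L are the unit vectors
    for n in range(L, N):
        g = 0
        for k in taps:
            g ^= cols[n - k]               # columns obey the same recurrence
        cols.append(g)
    return cols

def dot(state, g):
    """Inner product over GF(2) of two bit masks."""
    return bin(state & g).count("1") & 1

def lfsr_sequence(state_bits, taps, N):
    """Run the same recurrence directly on the sequence bits."""
    u = list(state_bits)
    for n in range(len(u), N):
        b = 0
        for k in taps:
            b ^= u[n - k]
        u.append(b)
    return u

L, N, taps = 4, 15, (3, 4)                 # toy parameters, hypothetical
state_bits = [1, 0, 0, 1]                  # initial state U_0 = (u_1, ..., u_L)
U0 = sum(b << k for k, b in enumerate(state_bits))
cols = generator_columns(taps, L, N)
u = lfsr_sequence(state_bits, taps, N)
assert all(dot(U0, cols[i]) == u[i] for i in range(N))   # equation (3.1)
```

Storing each column as an integer keeps the later XOR of $w$ columns to a single machine operation per pair.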

Definition 3.1. For arbitrary $B < i \le L$ and a given weight $w$, the set of parity check equations associated with the $i$th information symbol is the set $\Omega_i$ consisting of the following parity-check equations

$$(z_{j_1} \oplus Z_0 g_{j_1}) \oplus (z_{j_2} \oplus Z_0 g_{j_2}) \oplus \cdots \oplus (z_{j_w} \oplus Z_0 g_{j_w}),$$

where $1 \le j_1, j_2, \ldots, j_w \le N$ and $(g_{j_1} \oplus g_{j_2} \oplus \cdots \oplus g_{j_w})$ has arbitrary values in the first $B$ coordinates, value one at the $i$th coordinate, and value zero in all the other $L - B - 1$ coordinates.

Definition 3.2. The set $\Omega^*$ consists of the following parity-check equations

$$(z_{j_1} \oplus Z_0 g_{j_1}) \oplus (z_{j_2} \oplus Z_0 g_{j_2}) \oplus \cdots \oplus (z_{j_w} \oplus Z_0 g_{j_w}),$$

where $1 \le j_1, j_2, \ldots, j_w \le N$ and $(g_{j_1} \oplus g_{j_2} \oplus \cdots \oplus g_{j_w})$ has arbitrary values in the first $B$ coordinates and value zero in all the other $L - B$ coordinates.







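It is not difficult to see that $|\Omega^*| \approx |\Omega_i|$ for $i = B+1, \ldots, L$. Let $m = |\Omega^*|$. For $1 \le j \le m$, the $j$th parity check equation in $\Omega_i$ has the following relation:

$$\begin{aligned}
c_{ij} &= (z_{j_1} \oplus Z_0 g_{j_1}) \oplus (z_{j_2} \oplus Z_0 g_{j_2}) \oplus \cdots \oplus (z_{j_w} \oplus Z_0 g_{j_w}) \\
&= Z_0 (g_{j_1} \oplus g_{j_2} \oplus \cdots \oplus g_{j_w}) \oplus \sum_{k=1}^{w} z_{j_k} \\
&= Z'_0 (g_{j_1} \oplus g_{j_2} \oplus \cdots \oplus g_{j_w}) \oplus b_{ij},
\end{aligned} \tag{3.3}$$

where $Z'_0 = (z_1, z_2, \ldots, z_B, 0, \ldots, 0)$ and $b_{ij} = z_i \oplus \sum_{k=1}^{w} z_{j_k}$.

Similarly, by Definition 3.2, the value of the $j$th equation in $\Omega^*$ can be expressed as follows:

$$\begin{aligned}
c_j &= (z_{j_1} \oplus Z_0 g_{j_1}) \oplus (z_{j_2} \oplus Z_0 g_{j_2}) \oplus \cdots \oplus (z_{j_w} \oplus Z_0 g_{j_w}) \\
&= Z_0 (g_{j_1} \oplus g_{j_2} \oplus \cdots \oplus g_{j_w}) \oplus \sum_{k=1}^{w} z_{j_k} \\
&= Z'_0 (g_{j_1} \oplus g_{j_2} \oplus \cdots \oplus g_{j_w}) \oplus b_j,
\end{aligned} \tag{3.4}$$

where $b_j = \sum_{k=1}^{w} z_{j_k}$.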

We outline the precomputation of $\Omega_i$ and $\Omega^*$ in the following algorithm.

Precomputing Algorithm:

Input: Integers $B$, $L$, $N$, $w$ and the generator matrix $G_{LFSR}$.

Processing steps: For arbitrary $w$ columns $g_{j_1}, g_{j_2}, \ldots, g_{j_w}$ of $G_{LFSR}$, if $(g_{j_1} \oplus g_{j_2} \oplus \cdots \oplus g_{j_w})$ has arbitrary values in the first $B$ coordinates, value one at the $i$th coordinate, and value zero in all other coordinates, then the vector $(j_1, j_2, \ldots, j_w)$ and the vector of the first $B$ coordinates of $(g_{j_1} \oplus g_{j_2} \oplus \cdots \oplus g_{j_w})$ are stored as a record in the set $\Omega_i$.

If $(g_{j_1} \oplus g_{j_2} \oplus \cdots \oplus g_{j_w})$ has arbitrary values in the first $B$ coordinates and value zero in all the other coordinates, then the vector $(j_1, j_2, \ldots, j_w)$ and the vector of the first $B$ coordinates of $(g_{j_1} \oplus g_{j_2} \oplus \cdots \oplus g_{j_w})$ are stored as a record in the set $\Omega^*$.

Output: The sets of parity check equations $\Omega^*$ and $\Omega_i$, for $i = B+1, \ldots, L$.

In practice, for the case $w = 2$, the parity check equations can be found in a very simple way as follows. We put each column of $G_{LFSR}$ into one of several "buckets", according to the value of its last $L - B$ positions. Each pair of columns in the same bucket provides one parity check equation in $\Omega^*$. For any two buckets whose labels differ only in the $i$th position among the last $L - B$ positions, each pair consisting of one column from each bucket provides one parity check equation in $\Omega_i$, for $i = B+1, \ldots, L$. For $w \ge 3$, we store the columns in the same way as for $w = 2$. To find a parity check equation, we run through all sets of $w - 1$ columns, add them, and look in the bucket corresponding to the values of the last $L - B$ bits. Thus all the parity-check equations in $\Omega^*$ and $\Omega_i$ can be found.
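The bucket method for $w = 2$ can be sketched as follows. This is a minimal illustration, not the authors' implementation: columns are assumed to be stored as $L$-bit integers whose bit $k-1$ holds coordinate $k$, so the last $L - B$ coordinates are obtained by a right shift, and the record layout is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def weight2_parity_checks(cols, L, B):
    """Return (omega_star, omega) for w = 2.

    cols[j-1] is column g_j as an L-bit integer.
    Each record is (j1, j2, first_B_coordinates_of_the_sum), 1-based indices.
    """
    buckets = defaultdict(list)
    for j, g in enumerate(cols, start=1):
        buckets[g >> B].append(j)          # bucket key = last L-B coordinates

    mask = (1 << B) - 1
    omega_star = []
    omega = {i: [] for i in range(B + 1, L + 1)}
    for key, idxs in buckets.items():
        # Same bucket: the sum vanishes on the last L-B coordinates.
        for j1, j2 in combinations(idxs, 2):
            s = cols[j1 - 1] ^ cols[j2 - 1]
            omega_star.append((j1, j2, s & mask))
        # Buckets whose keys differ only in coordinate i feed Omega_i.
        for i in range(B + 1, L + 1):
            other = key ^ (1 << (i - B - 1))
            if other > key and other in buckets:   # visit each bucket pair once
                for j1 in idxs:
                    for j2 in buckets[other]:
                        s = cols[j1 - 1] ^ cols[j2 - 1]
                        omega[i].append((j1, j2, s & mask))
    return omega_star, omega
```

Grouping by the last $L - B$ coordinates avoids testing all pairs of columns: only columns in the same bucket, or in buckets one bit apart, are ever combined.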

Lemma 3.3. [6] A tight approximation of the expected value of $|\Omega^*|$ or $|\Omega_i|$ is $2^{B-L}\binom{N-L}{w}$.







As an illustration, note that for $N = 40000$, $L = 40$, $w = 2$, and $B = 18, 19, 20, 21, 22$, Lemma 3.3 yields that the expected cardinality $m$ is equal to 190, 380, 761, 1522, 3045, respectively.
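The quoted cardinalities can be reproduced numerically. The sketch below assumes that the approximation of Lemma 3.3 has the form $m \approx 2^{B-L}\binom{N-L}{w}$; this form is an assumption chosen because, after truncation to an integer, it matches all five values listed above.

```python
from math import comb

def expected_cardinality(N, L, B, w):
    """Assumed approximation m ~ C(N - L, w) * 2^(B - L)."""
    return comb(N - L, w) / 2 ** (L - B)

# Parameters of the illustration: N = 40000, L = 40, w = 2.
for B in (18, 19, 20, 21, 22):
    print(B, int(expected_cardinality(40000, 40, B, 2)))
```

Each additional exhaustively searched bit doubles the number of available parity check equations, which is the trade-off exploited in the later sections.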

Lemma 3.3 implies that the expected cardinalities of the parity-check sets specified by Definitions 3.1 and 3.2 do not depend on the LFSR feedback polynomial, and in particular not on its weight, since the expected cardinalities of $\Omega_i$ and $\Omega^*$ are the same. For convenience, we assume that all the sets $\Omega_i$ and $\Omega^*$ have identical cardinality, denoted by $m$ in the sequel.



4. The Random Variables of the Number of 
Passed-Parity-Check Equations 

Assume $(x_1, x_2, \ldots, x_B)$ are the first $B$ information bits of a codeword $x$ in the linear $[N, L]$-code. By (3.3) and (3.4), we have that

$$c'_{ij} = \sum_{k=1}^{B} a_{jk} x_k \oplus b_{ij}, \tag{4.1}$$

$$c'_{j} = \sum_{k=1}^{B} a_{jk} x_k \oplus b_{j}. \tag{4.2}$$

Let $(u_1, u_2, \ldots, u_N)$ be the target LFSR sequence. We have the following lemmas.



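Lemma 4.1. Let $p = \frac{1}{2} - \delta = P(z_n \ne u_n)$, $p' = \frac{1}{2} - \varepsilon = P(c'_j \ne 0)$, $p_i = \frac{1}{2} - \varepsilon_i = P(c'_{ij} \ne 0)$. If $(u_1, u_2, \ldots, u_B) = (x_1, x_2, \ldots, x_B)$, then $\varepsilon = 2^{w-1}\delta^w$ and $|\varepsilon_i| = \varepsilon$.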



Proof. The first part is proved in [5]. We now prove the second part. Since

$$p' = \frac{1}{2} - \varepsilon = P(c'_j \ne 0) = P\Bigl(\sum_{k=1}^{w} z_{j_k} \ne \sum_{k=1}^{B} a_{jk} u_k\Bigr),$$

it implies that $p'$ is the probability of the equations (4.2) with weight $w$ being not a passed-parity-check equation. Similarly, since

$$p_i = \frac{1}{2} - \varepsilon_i = P(c'_{ij} \ne 0) = P\Bigl(\sum_{k=1}^{w} z_{j_k} \ne z_i \oplus \sum_{k=1}^{B} a_{jk} u_k\Bigr),$$

and if $z_i = u_i$ then

$$P\Bigl(\sum_{k=1}^{w} z_{j_k} \ne z_i \oplus \sum_{k=1}^{B} a_{jk} u_k\Bigr) = P\Bigl(\sum_{k=1}^{w} z_{j_k} \ne u_i \oplus \sum_{k=1}^{B} a_{jk} u_k\Bigr),$$

namely, $p_i$ is the probability of the equations (4.1) with weight $w$ being not a passed-parity-check equation. Therefore $p_i = p'$ and $\varepsilon_i = \varepsilon$.

If $z_i \ne u_i$ then

$$P\Bigl(\sum_{k=1}^{w} z_{j_k} \ne z_i \oplus \sum_{k=1}^{B} a_{jk} u_k\Bigr) = P\Bigl(\sum_{k=1}^{w} z_{j_k} = u_i \oplus \sum_{k=1}^{B} a_{jk} u_k\Bigr),$$

namely, $p_i$ is the probability of the equations (4.1) with weight $w$ being a passed-parity-check equation. Thus $p_i = 1 - p'$ and $\varepsilon_i = -\varepsilon$. □
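The first part of Lemma 4.1, $\varepsilon = 2^{w-1}\delta^w$, can be checked numerically. The sketch below computes the exact probability that the XOR of $w$ noise bits is zero and compares its bias with the closed form; it assumes that the noise bits are independent, each equal to one with probability $p = 1/2 - \delta$. The function name is illustrative.

```python
def xor_bias(delta: float, w: int) -> float:
    """Exact P(e_1 ^ ... ^ e_w = 0) - 1/2 for independent bits with P(e = 1) = 1/2 - delta."""
    p = 0.5 - delta
    p_zero = 1.0                       # the XOR of zero bits is 0
    for _ in range(w):
        # Adding one more bit keeps the parity with prob. 1-p and flips it with prob. p.
        p_zero = p_zero * (1 - p) + (1 - p_zero) * p
    return p_zero - 0.5

delta = 0.1
for w in (2, 3, 4):
    print(w, xor_bias(delta, w), 2 ** (w - 1) * delta ** w)
```

The two printed columns agree, and they shrink quickly with $w$: heavier parity checks are individually much less informative, which is why their larger number has to compensate.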

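Lemma 4.2. Let $p = \frac{1}{2} - \delta = P(z_n \ne u_n)$, $p' = \frac{1}{2} - \varepsilon = P(c'_j \ne 0)$, $p_i = \frac{1}{2} - \varepsilon_i = P(c'_{ij} \ne 0)$. Suppose that $a_{j1}, a_{j2}, \ldots, a_{jB}$ are pairwise independent random variables with $P(a_{jk} = 0) = P(a_{jk} = 1) = \frac{1}{2}$ for an arbitrary integer $k$ ($1 \le k \le B$). If $(u_1, u_2, \ldots, u_B) \ne (x_1, x_2, \ldots, x_B)$, then $\varepsilon_i = \varepsilon = 0$.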



Proof. If $(u_1, u_2, \ldots, u_B) \ne (x_1, x_2, \ldots, x_B)$ then there exists at least one integer $i$ such that $u_i \ne x_i$. Let $t$ be an integer such that $u_t \ne x_t$. Without loss of generality, let $x_1 \ne u_1, \ldots, x_t \ne u_t$ and $x_j = u_j$ for $j = t+1, \ldots, B$. Thus

$$\begin{aligned}
p' = \frac{1}{2} - \varepsilon = P(c'_j \ne 0) ={}& P\Bigl(\sum_{k=1}^{w} z_{j_k} = \sum_{k=1}^{B} a_{jk} u_k\Bigr) P(a_{j1} + \cdots + a_{jt} = 1) \\
&+ P\Bigl(\sum_{k=1}^{w} z_{j_k} \ne \sum_{k=1}^{B} a_{jk} u_k\Bigr) P(a_{j1} + \cdots + a_{jt} = 0).
\end{aligned}$$

Since the random variables $a_{j1}, a_{j2}, \ldots, a_{jB}$ are pairwise independent, and $P(a_{jk} = 0) = P(a_{jk} = 1) = \frac{1}{2}$ for $k$ ($1 \le k \le B$), we have

$$P(a_{j1} + \cdots + a_{jt} = 1) = P(a_{j1} + \cdots + a_{jt} = 0) = 1/2$$

and

$$p' = (1/2 + \varepsilon) \times 1/2 + (1/2 - \varepsilon) \times 1/2 = 1/2.$$

Therefore $\varepsilon = 0$. Similarly $\varepsilon_i = 0$. □



Remark 4.3. $a_{j1}, a_{j2}, \ldots, a_{jB}$ are the values in the first $B$ coordinates of the sum of some $w$ columns of $G_{LFSR}$. Experimental results indicate that $a_{j1}, a_{j2}, \ldots, a_{jB}$ are pairwise independent and that $P(a_{jk} = 0) = P(a_{jk} = 1) = \frac{1}{2}$.

Let $S$ and $S_i$ be defined by the following equations.

$$S = \sum_{j=1}^{m} (c'_j \oplus 1), \tag{4.3}$$

$$S_i = \sum_{j=1}^{m} (c'_{ij} \oplus 1). \tag{4.4}$$

Thus $S$ and $S_i$ are the numbers of passed-parity-check equations in $\Omega^*$ and $\Omega_i$ respectively.







Theorem 4.4.

(i) If $(u_1, u_2, \ldots, u_B) = (x_1, x_2, \ldots, x_B)$, then $S$ has binomial distribution $B(m, 1/2 + \varepsilon)$, and $S_i$ has binomial distribution $B(m, 1/2 + \varepsilon)$ or $B(m, 1/2 - \varepsilon)$.

(ii) If $(u_1, u_2, \ldots, u_B) \ne (x_1, x_2, \ldots, x_B)$, then both $S$ and $S_i$ have binomial distribution $B(m, 1/2)$.

Proof. Let $a_{jk}$ be the $k$th value of the vector $(g_{j_1} \oplus g_{j_2} \oplus \cdots \oplus g_{j_w})$ in the $j$th parity-check equation in the sets $\Omega_i$, where $i = B+1, \ldots, L$ and $k = 1, 2, \ldots, B$. Since $j_1, j_2, \ldots, j_w$ are selected randomly from the set $\{1, 2, \ldots, N\}$, the vector $(g_{j_1} \oplus g_{j_2} \oplus \cdots \oplus g_{j_w})$ can be regarded as a random variable on $GF(2^L)$. Therefore we have $P(a_{jk} = 0) = P(a_{jk} = 1) = \frac{1}{2}$, and $a_{j1}, a_{j2}, \ldots, a_{jB}$ are pairwise independent.

Suppose that $(u_1, u_2, \ldots, u_B) = (x_1, x_2, \ldots, x_B)$. By Lemma 4.1, $P(c'_j = 0) = 1/2 + \varepsilon$, and

$$P(c'_{ij} + z_i = 0) = P(c'_{ij} = 0)P(z_i = 0) + P(c'_{ij} = 1)P(z_i = 1).$$

For a given integer $i$, $B + 1 \le i \le L$, $z_i$ is a constant. If $z_i = 1$, then $P(c'_{ij} + z_i = 0) = P(c'_{ij} = 1) = 1/2 - \varepsilon_i$. If $z_i = 0$, then $P(c'_{ij} + z_i = 0) = P(c'_{ij} = 0) = 1/2 + \varepsilon_i$. Thus $S$ has binomial distribution $B(m, 1/2 + \varepsilon)$ and $S_i$ has binomial distribution $B(m, 1/2 + \varepsilon)$ or $B(m, 1/2 - \varepsilon)$.

When $(u_1, u_2, \ldots, u_B) \ne (x_1, x_2, \ldots, x_B)$, by Lemma 4.2, we get $\varepsilon_i = \varepsilon = 0$. Therefore $S$ and $S_i$ have binomial distribution $B(m, 1/2)$. □



Lemma 4.5. (De Moivre-Laplace central limit theorem [10]) Suppose $p$ ($0 < p < 1$) is the probability of success on each trial in $n$ Bernoulli trials, $q = 1 - p$, and $\xi_n$ is the number of successes, so that $\xi_n \sim B(n, p)$. Then, when $n \to \infty$,

$$\frac{\xi_n - np}{\sqrt{npq}} \sim N(0, 1),$$

i.e.,

$$\lim_{n \to \infty} P\Bigl(\frac{\xi_n - np}{\sqrt{npq}} < x\Bigr) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt.$$

Corollary 4.6. Let $\xi = \dfrac{S - m/2}{\sqrt{m}/2}$, $\xi_i = \dfrac{S_i - m/2}{\sqrt{m}/2}$ for $i = B+1, \ldots, L$, and $\eta = \xi^2 + \sum_{i=B+1}^{L} \xi_i^2$. Let $m$ be sufficiently large. Then:

(i) If $(u_1, u_2, \ldots, u_B) \ne (x_1, x_2, \ldots, x_B)$, then $\eta$ has chi-square distribution $\chi^2(L - B + 1)$ with $L - B + 1$ degrees of freedom. The expectation is $E(\eta) = L - B + 1$. Here we denote $\eta$ as $\eta_1$.

(ii) If $(u_1, u_2, \ldots, u_B) = (x_1, x_2, \ldots, x_B)$, then the expectation is $E(\eta) = (L - B + 1)(1 - 4\varepsilon^2 + 4m\varepsilon^2)$. We denote this $\eta$ as $\eta_2$.



Proof. When $(u_1, u_2, \ldots, u_B) \ne (x_1, x_2, \ldots, x_B)$, by Theorem 4.4, both $S$ and $S_i$ have binomial distribution $B(m, 1/2)$. Because $m$ is large enough, by Lemma 4.5, both $\xi$ and $\xi_i$ have distribution $N(0, 1)$. Since the parity check equations in $\Omega^*$ and $\Omega_i$ are constructed independently, we can regard $\xi$ and $\xi_i$ as independent random variables. By the definition of the chi-square distribution, we have the conclusion (i).

When $(u_1, u_2, \ldots, u_B) = (x_1, x_2, \ldots, x_B)$, we know

$$E(\eta) = E(\xi^2) + \sum_{i=B+1}^{L} E(\xi_i^2).$$

Clearly, we have

$$E(\xi^2) = \frac{4}{m}\Bigl(E(S^2) - mE(S) + \frac{m^2}{4}\Bigr) = \frac{4}{m}\Bigl(D(S) + (E(S))^2 - mE(S) + \frac{m^2}{4}\Bigr).$$

By Theorem 4.4, $S$ has distribution $B(m, 1/2 + \varepsilon)$, and thus

$$D(S) = m\Bigl(\frac{1}{2} + \varepsilon\Bigr)\Bigl(\frac{1}{2} - \varepsilon\Bigr)$$

and

$$E(S) = m\Bigl(\frac{1}{2} + \varepsilon\Bigr).$$

Therefore we have

$$E(\xi^2) = 1 - 4\varepsilon^2 + 4m\varepsilon^2.$$

Similarly, $E(\xi_i^2) = E(\xi^2) = 1 - 4\varepsilon^2 + 4m\varepsilon^2$. Thus

$$E(\eta) = (L - B + 1)(1 - 4\varepsilon^2 + 4m\varepsilon^2).$$

□
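The expectation in part (ii) of Corollary 4.6 is easy to evaluate numerically. The following sketch combines it with $\varepsilon = 2^{w-1}\delta^w$ from Lemma 4.1 and $\delta = 1/2 - p$; the function name and argument order are illustrative.

```python
def expected_eta2(L, B, m, p, w=2):
    """E(eta_2) = (L - B + 1)(1 - 4 eps^2 + 4 m eps^2) with eps = 2^(w-1) delta^w."""
    delta = 0.5 - p
    eps = 2 ** (w - 1) * delta ** w
    return (L - B + 1) * (1 - 4 * eps ** 2 + 4 * m * eps ** 2)

# L = 40, B = 18, m = 190, p = 0.30, w = 2
print(round(expected_eta2(40, 18, 190, 0.30), 2))   # 134.28
```

For comparison, under a wrong hypothesis the expectation is only $L - B + 1 = 23$ for these parameters, so the two cases are far apart when $p = 0.30$.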



By Corollary 4.6, we know that when $\varepsilon$ is a constant and the cardinality $m$ of the parity check equation sets is large enough, the random variable $\eta_1$ has distribution $\chi^2(L - B + 1)$, which does not depend on $m$, whereas the expectation of $\eta_2$ increases linearly with $m$. Therefore when $m$ is large, the distinction between $\eta_2$ and $\eta_1$ is obvious, and it can be used to determine whether the hypothesis $(u_1, u_2, \ldots, u_B) = (x_1, x_2, \ldots, x_B)$ is correct or not. Clearly, the larger $E(\eta_2)$ is, the better the performance becomes. However, if $\varepsilon$ is quite small or $m$ is comparatively small, this distinction is not reliable.

In practice, we can find a critical value $T_1$ as a threshold such that $P(\eta_1 > T_1) = d$. This means that a wrong hypothesis $(x_1, x_2, \ldots, x_B) \ne (u_1, u_2, \ldots, u_B)$ is correctly rejected with probability $1 - d$. We call $1 - d$ the distinguishable probability between $\eta_1$ and $\eta_2$. Hence if there are $2^B$ possibilities to be exhaustively searched for the information bits $(x_1, x_2, \ldots, x_B)$, the number of remaining possibilities which need to be further judged is $2^B d$.
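A threshold $T_1$ with $P(\eta_1 > T_1) = d$ can be computed from the chi-square distribution with $L - B + 1$ degrees of freedom. The sketch below is one possible way to do this using only the standard library: it uses the classical closed-form tail sums for integer degrees of freedom and a simple bisection. The function names are illustrative, and the numerical method is not taken from the paper.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """P(chi-square with df degrees of freedom > x), closed form for integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # exp(-x/2) * sum_{j=0}^{df/2-1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(chi/sqrt(2)) + 2*phi(chi) * sum_r chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + 2 * phi * total

def threshold(df: int, d: float) -> float:
    """Solve P(eta_1 > T_1) = d for T_1 by bisection."""
    lo, hi = 0.0, 10.0 * df + 100.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_sf(mid, df) > d:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# L = 40, B = 18 gives L - B + 1 = 23 degrees of freedom.
print(round(threshold(23, 0.005), 2))
```

With $d = 0.005$, only about one wrong hypothesis in 200 survives the test, so the expensive correlation check is applied to roughly $2^B d$ candidates.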

To give an intuitive picture of the statistical distinguishability of $\eta_1$ and $\eta_2$, we give the following experimental data in Table 1 with $N = 40000$, $L = 40$, $w = 2$, different noise ratios $p$, and different numbers $B$ of bits for exhaustive search. In the table, $E(\eta_2)$ stands for the expectation of $\eta_2$, and $T_1$ is the threshold satisfying $P(\eta_1 > T_1) = 0.005$.

          E(η_2)
  p       B = 18     B = 19     B = 20     B = 21      B = 22
          m = 190    m = 380    m = 761    m = 1522    m = 3045
  0.30    134.28     235.45     429.58     798.75      1499.60
  0.31    113.64     195.86     353.78     654.30      1224.96
  0.32     96.01     162.05     289.07     530.94       990.42
  0.33     81.09     133.42     234.28     426.51       791.88
  0.34     68.58     109.43     188.35     338.98       625.45
  0.35     58.21      89.54     150.01     266.40       487.47
  0.36     49.71      73.25     119.10     206.98       374.59
  0.37     42.86      60.10      93.93     159.01       283.30
  0.38     37.42      49.66      73.95     120.92       210.87
  0.39     33.18      41.53      58.39      91.26       154.48
  0.40     29.95      35.34      46.53      68.67       111.53
  T_1      44.18      42.80      41.40      40.00        38.58

Table 1: The distinguishability between $\eta_2$ and $\eta_1$.

Clearly, when $E(\eta_2) < T_1$, $\eta_2$ and $\eta_1$ are indistinguishable. We call $P(\eta_2 < T_1)$ the indistinguishable probability of $\eta_2$ and $\eta_1$. To compute $P(\eta_2 < T_1)$, we need the following lemmas.



Lemma 4.7. ([9]) Let $u_r$ be the $r$th central moment of the binomial distribution $B(n, p)$, i.e., $u_r = E(x - np)^r$, where $r \ge 2$, and let $q = 1 - p$. Then we have the following recursion formula:

$$u_r = npq \sum_{i=0}^{r-2} \binom{r-1}{i} u_i - p \sum_{i=0}^{r-2} \binom{r-1}{i} u_{i+1}.$$



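Lemma 4.8. (Lindeberg-Levy theorem [10]) If $X_1, X_2, \ldots, X_n$ is a sequence of independent random variables with $E(X_k) = a$ and $D(X_k) = \sigma^2$ ($\sigma^2 > 0$), $k = 1, 2, \ldots, n$, then

$$\frac{\sum_{k=1}^{n} X_k - na}{\sigma\sqrt{n}} \sim N(0, 1),$$

i.e.,

$$\lim_{n \to \infty} P\Bigl(\frac{\sum_{k=1}^{n} X_k - na}{\sigma\sqrt{n}} < x\Bigr) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt.$$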



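Theorem 4.9. Let $m$ be sufficiently large, let $S$ and $S_i$ have distribution $B(m, 1/2 + \varepsilon)$ or $B(m, 1/2 - \varepsilon)$, and let $X_0 = \bigl(\frac{2S - m}{\sqrt{m}}\bigr)^2$ and $X_i = \bigl(\frac{2S_i - m}{\sqrt{m}}\bigr)^2$. Let $\eta_2 = X_0 + \sum_{i=B+1}^{L} X_i$, $E(X_i) = a$, $D(X_i) = \sigma^2$, and let $L - B + 1$ be sufficiently large. Then

$$a = 1 - 4\varepsilon^2 + 4m\varepsilon^2, \tag{4.5}$$

$$\sigma^2 = \frac{1}{256m}\bigl(-512 + 512m - 12288m\varepsilon^2 + 8192\varepsilon^2 + 40960m\varepsilon^4 - 24576\varepsilon^4 + 4096m^2\varepsilon^2 - 16384m^2\varepsilon^4\bigr), \tag{4.6}$$

and

$$P(\eta_2 < T_1) = \Phi\Bigl(\frac{T_1 - (L - B + 1)a}{\sigma\sqrt{L - B + 1}}\Bigr), \tag{4.7}$$

where $\Phi(x)$ is the distribution function of $N(0, 1)$.

Proof. By the proof of Corollary 4.6, we know that the $X_i$ have the same expectation. Now we consider the variances of the $X_i$. Writing $p = 1/2 \pm \varepsilon$ for the success probability of $S_i$ and $q = 1 - p$, we have

$$D(X_i) = E(X_i^2) - (E(X_i))^2 = \frac{16}{m^2}u_4 + \frac{32}{m}(2p - 1)u_3 + 24(2p - 1)^2 u_2 + 8m(2p - 1)^3 u_1 + m^2(2p - 1)^4 - (E(X_i))^2,$$

where $u_r$ is the $r$th central moment of $S_i$. By Corollary 4.6, we have

$$a = E(X_i) = 1 - 4\varepsilon^2 + 4m\varepsilon^2.$$

Note that $u_0 = 1$, $u_1 = 0$, $u_2 = mpq$, $pq = \frac{1}{4} - \varepsilon^2$, $(2p - 1)^2 = 4\varepsilon^2$, and by Lemma 4.7,

$$u_3 = mpq(1 - 2p), \qquad u_4 = mpq + 3mpq\,u_2 - 3p(u_2 + u_3).$$

Thus

$$\frac{16}{m^2}u_4 + \frac{32}{m}(2p - 1)u_3 = \frac{16}{m^2}(mpq + 3mpq\,u_2) - \frac{48p}{m^2}(u_2 + u_3) + \frac{32}{m}(2p - 1)u_3,$$

and

$$-\frac{48p}{m^2}(u_2 + u_3) + \frac{32}{m}(2p - 1)u_3 = -\Bigl(\frac{96}{m}(pq)^2 + 32pq(2p - 1)^2\Bigr).$$

Hence

$$\begin{aligned}
D(X_i) ={}& \frac{16}{m^2}(mpq + 3mpq\,u_2) - \frac{48p}{m^2}(u_2 + u_3) + \frac{32}{m}(2p - 1)u_3 + 24(2p - 1)^2 u_2 \\
&+ 8m(2p - 1)^3 u_1 + m^2(2p - 1)^4 - (E(X_i))^2 \\
={}& \frac{16}{m}\Bigl(\bigl(\tfrac{1}{4} - \varepsilon^2\bigr) + 3m\bigl(\tfrac{1}{4} - \varepsilon^2\bigr)^2\Bigr) - \Bigl(\frac{96}{m}\bigl(\tfrac{1}{4} - \varepsilon^2\bigr)^2 + 32\bigl(\tfrac{1}{4} - \varepsilon^2\bigr)4\varepsilon^2\Bigr) \\
&+ 96\varepsilon^2 m\bigl(\tfrac{1}{4} - \varepsilon^2\bigr) + 16m^2\varepsilon^4 - (1 - 4\varepsilon^2 + 4m\varepsilon^2)^2 \\
={}& \frac{1}{256m}\bigl(-512 + 512m - 12288m\varepsilon^2 + 8192\varepsilon^2 + 40960m\varepsilon^4 \\
&- 24576\varepsilon^4 + 4096m^2\varepsilon^2 - 16384m^2\varepsilon^4\bigr).
\end{aligned}$$

By Lemma 4.8, we know

$$\frac{\eta_2 - (L - B + 1)a}{\sigma\sqrt{L - B + 1}} \sim N(0, 1),$$

i.e.,

$$P(\eta_2 < T_1) = P\Bigl(\frac{\eta_2 - (L - B + 1)a}{\sigma\sqrt{L - B + 1}} < \frac{T_1 - (L - B + 1)a}{\sigma\sqrt{L - B + 1}}\Bigr) = \Phi\Bigl(\frac{T_1 - (L - B + 1)a}{\sigma\sqrt{L - B + 1}}\Bigr).$$

□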
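The normal approximation (4.7) can be evaluated with a few lines of code. The sketch below computes $a$ and $\sigma$ directly from the central moments of the binomial distribution $B(m, 1/2 + \varepsilon)$ rather than from an expanded polynomial; this is a design choice of the example, and the function name is illustrative.

```python
import math

def indistinguishable_probability(L, B, m, eps, T1):
    """Normal approximation of P(eta_2 < T_1), with a and sigma from exact binomial moments."""
    p = 0.5 + eps
    q = 1.0 - p
    u2 = m * p * q                                  # central moments of B(m, p)
    u3 = m * p * q * (q - p)
    u4 = m * p * q * (1 + 3 * (m - 2) * p * q)
    c = m * eps                                     # E(S) - m/2
    mean_sq = u2 + c * c                            # E[(S - m/2)^2]
    var_sq = u4 - u2 * u2 + 4 * c * u3 + 4 * c * c * u2   # D[(S - m/2)^2]
    a = 4 * mean_sq / m                             # E(X_i), X_i = (2 S_i - m)^2 / m
    sigma = 4 * math.sqrt(var_sq) / m               # standard deviation of X_i
    n = L - B + 1
    z = (T1 - n * a) / (sigma * math.sqrt(n))
    return 0.5 * math.erfc(-z / math.sqrt(2))       # Phi(z)

# p = 0.36, w = 2: delta = 0.14, eps = 2 * 0.14**2 = 0.0392; B = 18, m = 190, T_1 = 44.18
print(round(indistinguishable_probability(40, 18, 190, 0.0392, 44.18), 3))
```

For these parameters the result is close to the theoretic value 0.327 listed for $p = 0.36$ and $B = 18$ in Table 3.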



By Theorem 4.9, when $L - B + 1$ is sufficiently large, the threshold value $T_1$ can be used to distinguish $\eta_2$ and $\eta_1$. The error decoding rate $p_e = P(\eta_2 < T_1)$ can be computed according to formula (4.7). Generally, when $L - B + 1 > 20$, the precision of the approximation is very satisfactory. There are some experimental results on the approximation in Section 6.

Theorem 4.10. Let $p = 1/2 - \delta = P(z_n \ne u_n)$, let $w$ be the weight of the parity-check equations, and let $N$, $L$, $B$ be as defined before. Then the performance of our correlation attack by using $(w + 1)$-weight parity check equations is better than the one by only using $w$-weight equations if and only if

$$2\delta\sqrt{\frac{N}{w + 1}} > 1. \tag{4.8}$$

Proof. Let $N_w$ be the expected value of $\eta_2$ when the weight of the parity check equations is $w$. By Corollary 4.6, $N_w = (L - B + 1)(1 - 4\varepsilon^2 + 4m_w\varepsilon^2)$, where $m_w$ is the expected cardinality of the weight-$w$ parity-check sets given by Lemma 3.3 and $\varepsilon = 2^{w-1}\delta^w$. We consider the difference between $N_w$ and $N_{w+1}$. Since $m_{w+1}/m_w \approx N/(w + 1)$ for large $N$, and the terms $-4\varepsilon^2$ are negligible compared with $4m_w\varepsilon^2$, we have

$$N_{w+1} - N_w \approx 4(L - B + 1)\, m_w\, (2^{w-1}\delta^w)^2 \Bigl(\frac{4\delta^2 N}{w + 1} - 1\Bigr).$$

Thus $2\delta\sqrt{N/(w + 1)} > 1$ if and only if $N_{w+1} - N_w > 0$. Because $\eta_1$ has the chi-square distribution with $L - B + 1$ degrees of freedom, it does not depend on $w$. Therefore, the bigger the expectation of $\eta_2$ is, the better $\eta_2$ and $\eta_1$ can be distinguished. Thus we conclude that the performance of our correlation attack by using $(w + 1)$-weight parity check equations is better than the one by only using $w$-weight equations if and only if $2\delta\sqrt{N/(w + 1)} > 1$. □



5. Our New Algorithms 

The main underlying principles for the construction of the novel fast correlation attack include the following:

1. A partial exhaustive search for the first $B$ information bits enhances the performance of the fast correlation attack.

2. A statistical threshold ensures a precise decision for efficiently finding the correct first $B$ information bits.

3. Repeatedly using one-step decoding makes our new approach fast, with low computational complexity.

According to these principles a new algorithm for the fast correlation attack is proposed. The algorithm is based on the parity-check sets of Section 3. The threshold $T_1$ is used to determine whether a hypothesis for the first $B$ information bits is right. $T_1$ can be calculated according to the method in Section 4 for a given distinguishable probability $1 - d$ between $\eta_1$ and $\eta_2$. The threshold $T$ for correlation checks can be calculated by a method in [8]. The thresholds $T_i$ for the $i$th information bit can be calculated according to a method in [1].







Algorithm A:

INPUT:

The parameters $N$, $L$, $B$, the thresholds $T$, $T_1$ and $T_i$;

The received noisy bits $z_1, z_2, \ldots, z_N$;

The parity-check equation sets $\Omega^*$ and $\Omega_i$ for $i = B+1, \ldots, L$.

PROCESSING STEPS:

Step 1: Setting the hypothesis

From the set of all possible $2^B$ binary patterns, select a not previously considered pattern $(x_1, x_2, \ldots, x_B)$ for the first $B$ information bits. If no new pattern is available, go to Step 3.

Step 2: Decoding

(a) Calculate $S_i$ and $S$, the numbers of passed-parity-check equations in $\Omega_i$ and $\Omega^*$. Then calculate $\eta$. If $\eta < T_1$, go to Step 1. ($S_i$, $S$, and $\eta$ are specified in Section 4.)

(b) For every $i$, if $S_i < T_i$, then $x_i = z_i \oplus 1$, else $x_i = z_i$.

(c) Check whether the current estimate of the information bits $(x_1, x_2, \ldots, x_L)$ is the true one according to the following:
For $(x_1, x_2, \ldots, x_L)$ generate the corresponding sequence $x_1, x_2, \ldots, x_N$, and calculate $S^* = \sum_{n=1}^{N} (x_n \oplus z_n)$. If $S^* < T$, go to OUTPUT (a), otherwise store $(x_1, x_2, \ldots, x_B)$ into the set $A$.

Step 3: Twice-step decoding

(a) For every $(x_1, x_2, \ldots, x_B)$ stored in the set $A$, transform the linear $[N, L]$-code into a linear $[N, L - B]$-code.

(b) Decode the linear $[N, L - B]$-code. If the decoding succeeds, go to OUTPUT (a), otherwise go to OUTPUT (b).

OUTPUT:

(a) Output the result $[x_1, x_2, \ldots, x_L]$ as $[u_1, u_2, \ldots, u_L]$;

(b) The correlation attack fails; the correct information bits are not found.
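Step 2(a) of Algorithm A can be sketched as follows. The example assumes that $\eta$ is formed from the standardized counts $\xi = (2S - m)/\sqrt{m}$ and $\xi_i = (2S_i - m)/\sqrt{m}$ as in Section 4; the function names and the list layout of the counts $S_i$ are hypothetical.

```python
def eta_statistic(S, S_list, m):
    """eta = xi^2 + sum_i xi_i^2, where xi = (2S - m) / sqrt(m)."""
    return ((2 * S - m) ** 2 + sum((2 * Si - m) ** 2 for Si in S_list)) / m

def keep_hypothesis(S, S_list, m, T1):
    """Step 2(a): a hypothesis is rejected when eta < T1 and kept otherwise."""
    return eta_statistic(S, S_list, m) >= T1

# Toy counts: m = 100 equations per set, one set Omega* and two sets Omega_i.
print(eta_statistic(60, [40, 55], 100))      # (400 + 400 + 100) / 100 = 9.0
print(keep_hypothesis(60, [40, 55], 100, T1=5.0))
```

Because each term is squared, a set $\Omega_i$ contributes equally whether its count lies above or below $m/2$, which matches the two possible distributions of $S_i$ in Theorem 4.4.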

Remark 5.1. With knowledge of the first $B$ information symbols, the problem of restoring the remaining $L - B$ bits is much simpler than the original problem. Hence we can neglect the computational complexity of the twice-step decoding process.

Similarly, we present another new algorithm B based on a simple ML-decoding procedure.

Let $F_0 = S$. If $S_i > m/2$, then $F_i = S_i$, else $F_i = m - S_i$.

Algorithm B:

INPUT: $N$, $L$, $B$; the received noisy sequence $z_1, z_2, \ldots, z_N$; the parity-check sets $\Omega_i$ and $\Omega^*$.

DECODING: Exhaustively search the $2^B$ possibilities to find a vector $(x_1, x_2, \ldots, x_B)$ such that the sum $F_0 + \sum_{i=B+1}^{L} F_i$ is maximal.
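The decision rule of Algorithm B can be sketched as follows, assuming the counts $S$ and $S_i$ have already been computed for each candidate. The dictionary layout and the function names are hypothetical.

```python
def algorithm_b_score(S, S_list, m):
    """F_0 + sum_i F_i with F_0 = S and F_i = S_i if S_i > m/2 else m - S_i."""
    return S + sum(Si if Si > m / 2 else m - Si for Si in S_list)

def best_hypothesis(counts, m):
    """counts maps each candidate (x_1, ..., x_B) to (S, [S_{B+1}, ..., S_L])."""
    return max(counts, key=lambda x: algorithm_b_score(*counts[x], m))

# Toy example with B = 2, m = 100 and two sets Omega_i.
counts = {
    (0, 0): (52, [49, 53]),   # score 52 + 51 + 53 = 156
    (0, 1): (70, [25, 80]),   # score 70 + 75 + 80 = 225
}
print(best_hypothesis(counts, 100))          # (0, 1)
```

No threshold is needed here: every one of the $2^B$ candidates is scored, and the one with the largest folded count is returned.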







6. Performance Evaluation 

6.1. Complexity 

The computational complexity can be divided into two parts, the time for preprocessing and the decoding time. In the preprocessing part, the calculation of all parity-check equations is of order $O(N^{w-1} \log N)$. We also need to store each parity check equation, which is composed of its index positions and a $B$-bit vector, in $\Omega^*$ and $\Omega_i$. Thus the storage requirement is at most $(L - B + 1)m(B + w\log_2 N)$ bits. If we store $\sum_{k=1}^{w} z_{j_k}$ instead of the index positions, then the storage requirement is at most $(L - B + 1)m(B + 1)$ bits.

The complexity of the decoding step is given as follows:

Corollary 6.1. Let $W$ be the weight of the LFSR characteristic polynomial and $1 - d$ the distinguishable probability. Then the complexity of our algorithm A is proportional to $2^B[(L - B + 1)m + (N - L)Wd]$ mod 2 additions.

Corollary 6.2. The complexity of the proposed algorithm B is proportional to $2^B(L - B + 1)m$ mod 2 additions.
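To get a feeling for the orders of magnitude in Corollaries 6.1 and 6.2, the following sketch evaluates both expressions. The sample values ($N = 40000$, $L = 40$, $B = 20$, $m = 761$, $d = 0.005$, and $W = 17$, the number of terms of the characteristic polynomial used in Section 6.2) are taken from the text; the function names are illustrative.

```python
import math

def complexity_a(N, L, B, m, W, d):
    """Corollary 6.1: 2^B [(L - B + 1) m + (N - L) W d] mod 2 additions."""
    return 2 ** B * ((L - B + 1) * m + (N - L) * W * d)

def complexity_b(L, B, m):
    """Corollary 6.2: 2^B (L - B + 1) m mod 2 additions."""
    return 2 ** B * (L - B + 1) * m

ca = complexity_a(40000, 40, 20, 761, 17, 0.005)
cb = complexity_b(40, 20, 761)
print(f"algorithm A: about 2^{math.log2(ca):.1f} additions")
print(f"algorithm B: about 2^{math.log2(cb):.1f} additions")
```

The small factor $d$ keeps the cost of the correlation checks in algorithm A below the cost of evaluating the parity checks themselves for these parameters.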

6.2. Simulation 

We have carried out extensive experiments to evaluate the performance of our new algorithms. The LFSR characteristic polynomial we have chosen is $1 + x + x^3 + x^5 + x^9 + x^{11} + x^{12} + x^{17} + x^{19} + x^{21} + x^{25} + x^{27} + x^{29} + x^{32} + x^{33} + x^{38} + x^{40}$. We use $N = 40000$ and $w = 2$, and the distinguishable probability between $\eta_1$ and $\eta_2$ is $1 - d = 0.995$.

Firstly, we compare the proportion $P(\eta_1 > T_1)$ of retained hypotheses with the given $d$. Table 2 shows that the actual values of $P(\eta_1 > T_1)$ are close to the expected values $d$.

  B     p      d       m      T_1    P(η_1 > T_1)
  19    0.36   0.005   380    42.8   0.00495
  20    0.34   0.005   761    41.4   0.00497
  21    0.38   0.005   1522   40     0.00498

Table 2. Comparison of the simulation result with the theoretic value

Table 3 compares the error decoding probability of algorithm A with our theoretical estimate from formula (4.7) in Theorem 4.9. Notice that although $L - B + 1$ is only around 20, the actual values are close to our theoretical estimates; the difference is about 0.05.

Table 4 compares the performance of our algorithm A with the algorithm of [3]. The parameters $N$, $w$, $d$ and the characteristic polynomial are the same as above. We ran 1000 random experiments for each $B = 18, 19, 20, 21, 22$ and each $p$ between 0.30 and 0.40.

In Table 4, NewA means our new algorithm A, and [3] means the algorithm in [3]. The data in Table 4 show that the performance of our algorithm is significantly better than that of the algorithm of [3].





          The error probability of decoding p_e
          B = 18, m = 190          B = 19, m = 380          B = 20, m = 761
  p       theoretic  simulation    theoretic  simulation    theoretic  simulation
  0.30    0.000      0.000         0.000      0.000         0.000      0.000
  0.33    0.013      0.005         0.000      0.000         0.000      0.000
  0.36    0.327      0.252         0.026      0.007         0.000      0.000
  0.39    0.883      0.832         0.548      0.490         0.110      0.076

Table 3. Comparison of the actual decoding error probability with our theoretic estimation



          The error probability of decoding
          B = 18           B = 19           B = 20           B = 21           B = 22
          m = 190          m = 380          m = 761          m = 1522         m = 3045
  p       NewA    [3]      NewA    [3]      NewA    [3]      NewA    [3]      NewA    [3]
  0.30    0.000   0.254    0.000   0.023    0.000   0.000    0.000   0.000    0.000   0.000
  0.31    0.000   0.384    0.000   0.041    0.000   0.002    0.000   0.000    0.000   0.000
  0.32    0.000   0.569    0.000   0.098    0.000   0.002    0.000   0.000    0.000   0.000
  0.33    0.005   0.696    0.000   0.226    0.000   0.020    0.000   0.000    0.000   0.000
  0.34    0.020   0.838    0.000   0.356    0.000   0.053    0.000   0.001    0.000   0.000
  0.35    0.086   0.915    0.001   0.542    0.000   0.114    0.000   0.002    0.000   0.000
  0.36    0.252   0.955    0.007   0.743    0.000   0.225    0.000   0.019    0.000   0.000
  0.37    0.471   0.983    0.075   0.865    0.000   0.450    0.000   0.080    0.000   0.001
  0.38    0.695   0.990    0.243   0.932    0.007   0.652    0.000   0.210    0.000   0.023
  0.39    0.832   0.997    0.490   0.980    0.076   0.850    0.000   0.445    0.000   0.052
  0.40    0.921   1.000    0.729   0.988    0.292   0.935    0.005   0.663    0.000   0.267

Table 4. Comparison of the new algorithm A with the algorithm A of [3]



Table 5 compares the performance of our new algorithms A and B with the algorithm presented in [1].

          The error probability of decoding
          Algorithm in [1]        Algorithm B            Algorithm A
  p       N = 45000, m = 1941     N = 18000, m = 308     N = 18000, m = 308, d = 0.005
  0.33    0.39                    0.03                   0.00
  0.34    0.59                    0.18                   0.00
  0.35    0.75                    0.42                   0.01
  0.36    0.89                    0.67                   0.09
  0.37    0.98                    0.82                   0.27
  0.38    1.00                    0.99                   0.52

Table 5. The performance comparison between our new algorithm and one in [1]










The performance of our new algorithm is superior to that of the algorithm presented in [3], because the success of the algorithm in [3] depends on the successful decoding of every information bit. As soon as a single information bit is decoded incorrectly, the whole attack fails. The authors of [4] have partially overcome this shortcoming, but they needed $D - B$ sets of parity-check equations, where $D > L$. When $D$ increases, the complexities of precomputation and decoding also increase linearly. Moreover, the decoding in [4] needs $2^{B+1}$ initial states to be checked by correlation. This becomes the main part of the computational complexity of decoding. Recently, this approach was improved algorithmically by Chose et al. [2].

In contrast, a successful attack with our new algorithm does not rely on the successful decoding of particular information bits. It is decided by all the $L - B + 1$ parity-check sets jointly. The number of initial states to be checked by correlation in our algorithm is less than $2^B d$, where $d \le 0.005$. When the cardinality $m$ of the parity-check sets is large, our algorithm can precisely distinguish $\eta_2$ and $\eta_1$.

The decoding complexity of the algorithm of [3] is $2^B[(L - B + 1)mw + (N - L)w]$. The decoding complexity of our algorithm is $2^B[(L - B + 1)m + (N - L)wd]$. Since $d$ is very small, for example $d = 0.005$, and $w \ge 2$, our algorithm improves the decoding complexity.

Compared with the algorithm in [1], our new algorithm uses $L - B$ times more parity-check equations. This is an important reason why the performance of our new algorithm is superior to that of the algorithm in [1]. However, the complexity of the algorithm in [1] is $2^B m$, which is lower than that of the new one.



7. Conclusions 

We present a new powerful algorithm for fast correlation attacks. It involves more parity-check sets than other reported algorithms. The performance of the new algorithm is significantly improved at relatively low computational complexity. In particular, we use some random variables and their distributions to determine a statistical threshold which guarantees a precise decision for efficiently finding the correct first $B$ information bits. We also find a new formula to describe the relationship between the tendency of attack performance, the weight $w$ of the parity check equations, the noise level $\delta$, and the required length $N$ of the observed sequence. We believe that, with a set of parallel PCs and a few weeks of precomputation, our algorithm can carry out the correlation attack on LFSRs of length 80-100 with $p = 0.4$ on one PC in several hours by using $B = 35$ and $t > 3$.

Acknowledgments 

The first author was supported by National Natural Science Foundation of China 
(10171017, 90204013), Special Funds of Authors of Excellent Doctoral Dissertation 
in China, and Science and Technology Funds of Shanghai (035115019). 







References 

[1] V. Chepyzhov, T. Johansson, and B. Smeets, A simple algorithm for fast correlation attacks on stream ciphers, Lecture Notes in Computer Science, vol. 1978, 2001, pp. 181-195.

[2] P. Chose, A. Joux, and M. Mitton, Fast correlation attacks: an algorithmic point of view, Lecture Notes in Computer Science, vol. 2332, 2002, pp. 209-221.

[3] M. Mihaljevic, M. Fossorier, and H. Imai, A low-complexity and high-performance algorithm for the fast correlation attack, Lecture Notes in Computer Science, vol. 1978, 2001, pp. 196-212.

[4] M. Mihaljevic, M. Fossorier, and H. Imai, Fast correlation attack algorithm with list decoding and an application, Lecture Notes in Computer Science, vol. 2355, 2002, pp. 196-210.

[5] T. Johansson and F. Jonsson, Theoretical analysis of a correlation attack based on convolutional codes, IEEE Transactions on Information Theory, vol. 48, no. 8, August 2002, pp. 2173-2181.

[6] T. Johansson and F. Jonsson, Improved fast correlation attacks on stream ciphers via convolutional codes, Lecture Notes in Computer Science, vol. 1592, 1999, pp. 347-362.

[7] W. Meier and O. Staffelbach, Fast correlation attacks on certain stream ciphers, J. Cryptology, vol. 1, 1989, pp. 159-176.

[8] T. Siegenthaler, Decrypting a class of stream ciphers using ciphertext only, IEEE Trans. Comput., vol. C-34, 1985, pp. 81-85.

[9] K. Fang and J. Xu, Statistical Distribution, Science in China Press, Beijing, 1987, in Chinese, p. 67.

[10] Z. Wei et al., Introduction to Probability Theory and Statistics, Higher Education Press, Beijing, 1983, in Chinese, p. 209.



Peizhong Lu and Lianzhen Huang 

Department of Computer Sciences and Engineering 

and Institute of Mathematics 

Fudan University 

Fengzhen Road 85, Jiangwan Town 
Shanghai, Postcode 200434 
China 

e-mail: pzlu@fudan.edu.cn




Progress in Computer Science and Applied Logic, Vol. 23, 85-110 
© 2004 Birkhauser Verlag Basel/Switzerland 



LDPC Codes: An Introduction 

Amin Shokrollahi 



Abstract. LDPC codes are one of the hottest topics in coding theory today. 
Originally invented in the early 1960’s, they have experienced an amazing 
comeback in the last few years. Unlike many other classes of codes, LDPC 
codes are already equipped with very fast (probabilistic) encoding and decod- 
ing algorithms. The question is that of the design of the codes such that these 
algorithms can recover the original codeword in the face of large amounts of 
noise. New analytic and combinatorial tools make it possible to solve the de- 
sign problem. This makes LDPC codes not only attractive from a theoretical 
point of view, but also perfect for practical applications. In this note I will 
give a brief overview of the origins of LDPC codes and the methods used for 
their analysis and design. 

Keywords. LDPC codes, graph based codes. 



1. Introduction 

This note constitutes an attempt to highlight some of the main aspects of the 
theory of low-density parity-check (LDPC) codes. It is intended for a mathemati- 
cally mature audience with some background in coding theory, but without much 
knowledge about LDPC codes. 

The idea of writing a note like this came up during conversations that I 
had with Dr. Khosrovshahi, head of the Mathematics Section of the Institute for 
Studies in Theoretical Physics and Mathematics in December 2002. The main mo- 
tivation behind writing this note was to have a written document for Master’s and 
PhD students. The style is often informal, though I have tried not to compromise 
exactness. 

The note is by no means a complete survey. I have deliberately left out 
a number of interesting aspects of the theory, such as connections to statistical 



Most of the work on this paper was done when the author was visiting the Institute for Theoretical 
Physics and Mathematics (IPM) in Tehran. 







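[Figure: a bipartite graph with the message nodes $x_1, \ldots, x_{10}$ on the left and five check nodes on the right. Each check node is labeled with its constraint:]

$x_1 + x_2 + x_3 + x_4 + x_6 + x_8 + x_{10} = 0$
$x_1 + x_3 + x_4 + x_7 + x_8 + x_9 + x_{10} = 0$
$x_2 + x_4 + x_8 = 0$
$x_1 + x_5 + x_7 + x_8 + x_9 + x_{10} = 0$
$x_3 + x_4 + x_5 + x_7 + x_9 = 0$

Figure 1. An LDPC code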



mechanics. The important topics of general Tanner graphs, and factor graphs as 
well as connections to Turbo codes have also been left untouched. 

My emphasis in writing the notes has been on algorithmic and theoretical 
aspects of LDPC codes, and within these areas on statements that can be proved. I 
have not discussed any of the existing and very clever methods for the construction 
of LDPC codes, or issues regarding their implementation. 

Nevertheless, I hope that this document proves useful to at least some stu- 
dents or researchers interested in pursuing research in LDPC codes, or more gen- 
erally codes obtained from graphs. 




2. LDPC Codes 

LDPC codes were invented by Robert Gallager [13] in his PhD thesis. Soon after 
their invention, they were largely forgotten, and they were reinvented several times 
over the next 30 years. Their comeback is one of the most intriguing aspects of their 
history, since two different communities reinvented codes similar to Gallager's LDPC 
codes at roughly the same time, but for entirely different reasons (see [7, 19, 18, 21, 20, 
35, 36]). 

LDPC codes are linear codes obtained from sparse bipartite graphs. Suppose 
that $\mathcal{G}$ is a graph with $n$ left nodes (called message nodes) and $r$ right nodes (called 
check nodes). The graph gives rise to a linear code of block length $n$ and dimension 
at least $n - r$ in the following way: The $n$ coordinates of the codewords are associated 
with the $n$ message nodes. The codewords are those vectors $(c_1, \ldots, c_n)$ such 
that for all check nodes the sum of the neighboring positions among the message 
nodes is zero. Figure 1 gives an example. 

The graph representation is analogous to a matrix representation by looking 
at the adjacency matrix of the graph: let $H$ be a binary $r \times n$-matrix in which the 
entry $(i, j)$ is 1 if and only if the $i$th check node is connected to the $j$th message 
node in the graph. Then the LDPC code defined by the graph is the set of vectors 
$c = (c_1, \ldots, c_n)$ such that $H \cdot c^{\top} = 0$. The matrix $H$ is called a parity check matrix 
for the code. Conversely, any binary $r \times n$-matrix gives rise to a bipartite graph 
between $n$ message and $r$ check nodes, and the code defined as the null space of 
$H$ is precisely the code associated to this graph. Therefore, any linear code has 
a representation as a code associated to a bipartite graph (note that this graph 
is not uniquely defined by the code). However, not every binary linear code has 
a representation by a sparse bipartite graph.¹ If it does, then the code is called a 
low-density parity-check (LDPC) code. 
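
The passage from graph to parity check matrix can be made concrete with a short Python sketch. It assumes the five check equations of Figure 1 read as listed in `CHECKS` below (that reading of the figure, and all function names, are assumptions of this sketch). It builds $H$ and enumerates the codewords by brute force, which is only feasible because $n = 10$.

```python
import itertools

# Check equations of Figure 1 as lists of 1-based message-node indices.
# These index sets are this sketch's reading of the figure (an assumption).
CHECKS = [
    [1, 2, 3, 4, 6, 8, 10],
    [1, 3, 4, 7, 8, 9, 10],
    [2, 4, 8],
    [1, 5, 7, 8, 9, 10],
    [3, 4, 5, 7, 9],
]
N = 10  # number of message nodes


def parity_check_matrix(checks, n):
    """Return H as a list of rows; H[i][j] = 1 iff check i is joined to message node j+1."""
    return [[1 if j + 1 in chk else 0 for j in range(n)] for chk in checks]


def is_codeword(H, c):
    """True iff H * c^T = 0 over F_2."""
    return all(sum(h * x for h, x in zip(row, c)) % 2 == 0 for row in H)


H = parity_check_matrix(CHECKS, N)
codewords = [c for c in itertools.product((0, 1), repeat=N) if is_codeword(H, c)]
print(len(H), "x", len(H[0]), "parity check matrix,", len(codewords), "codewords")
```

Since the dimension is at least $n - r = 5$, the enumeration must find at least $2^5$ codewords.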

The sparsity of the graph structure is the key property that allows for the 
algorithmic efficiency of LDPC codes. The rest of this note is devoted to elaborating 
on this relationship. 



3. Decoding Algorithms: Belief Propagation 

Let me start by describing a general class of decoding algorithms for LDPC 
codes. These algorithms are called message passing algorithms, and are iterative 
algorithms. The reason for their name is that at each round of the algorithms 
messages are passed from message nodes to check nodes, and from check nodes back 
to message nodes. The messages from message nodes to check nodes are computed 
based on the observed value of the message node and some of the messages passed 
from the neighboring check nodes to that message node. An important aspect is 
that the message that is sent from a message node v to a check node c must not 
take into account the message sent in the previous round from c to v. The same 
is true for messages passed from check nodes to message nodes. 

One important subclass of message passing algorithms is the belief propagation 
algorithm. This algorithm is present in Gallager's work [13], and it is also 
used in the Artificial Intelligence community [28]. The messages passed along the 
edges in this algorithm are probabilities, or beliefs. More precisely, the message 
passed from a message node v to a check node c is the probability that v has a 
certain value given the observed value of that message node, and all the values 
communicated to v in the prior round from check nodes incident to v other than 
c. On the other hand, the message passed from c to v is the probability that v 
has a certain value given all the messages passed to c in the previous round from 
message nodes other than v. 



¹ To be more precise, sparsity only applies to sequences of matrices. A sequence of $m \times n$-matrices 
is called $c$-sparse if $mn$ tends to infinity and the number of nonzero elements in these matrices 
is always less than $c \max(m, n)$. 







It is easy to derive formulas for these probabilities under a certain assumption 
called the independence assumption, which I will discuss later. It is sometimes 
advantageous to work with likelihoods, or sometimes even log-likelihoods instead 
of probabilities. For a binary random variable $x$ let $L(x) = \Pr[x = 0]/\Pr[x = 1]$ 
be the likelihood of $x$. Given another random variable $y$, the conditional likelihood 
of $x$, denoted $L(x \mid y)$, is defined as $\Pr[x = 0 \mid y]/\Pr[x = 1 \mid y]$. Similarly, 
the log-likelihood of $x$ is $\ln L(x)$, and the conditional log-likelihood of $x$ given $y$ is 
$\ln L(x \mid y)$. 

If $x$ is an equiprobable random variable, then $L(x \mid y) = L(y \mid x)$ by Bayes' 
rule. Therefore, if $y_1, \ldots, y_d$ are independent random variables, then we have 

$$ \ln L(x \mid y_1, \ldots, y_d) = \sum_{i=1}^{d} \ln L(x \mid y_i). \tag{3.1} $$

Now suppose that $x_1, \ldots, x_\ell$ are binary random variables and $y_1, \ldots, y_\ell$ are random 
variables. Denote addition over $\mathbb{F}_2$ by $\oplus$. We would like to calculate 
$\ln L(x_1 \oplus \cdots \oplus x_\ell \mid y_1, \ldots, y_\ell)$. Note that if $p = 2\Pr[x_1 = 0 \mid y_1] - 1$ and 
$q = 2\Pr[x_2 = 0 \mid y_2] - 1$, then $2\Pr[x_1 \oplus x_2 = 0 \mid y_1, y_2] - 1 = pq$. (Why?) Therefore, 
$2\Pr[x_1 \oplus \cdots \oplus x_\ell = 0 \mid y_1, \ldots, y_\ell] - 1 = \prod_{i=1}^{\ell} (2\Pr[x_i = 0 \mid y_i] - 1)$. 
Since $\Pr[x_i = 0 \mid y_i] = L(x_i \mid y_i)/(1 + L(x_i \mid y_i))$, we have that 
$2\Pr[x_i = 0 \mid y_i] - 1 = (L - 1)/(L + 1) = \tanh(\ell/2)$, 
where $L = L(x_i \mid y_i)$ and $\ell = \ln L$. Therefore, we obtain 

$$ \ln L(x_1 \oplus \cdots \oplus x_\ell \mid y_1, \ldots, y_\ell) = \ln \frac{1 + \prod_{i=1}^{\ell} \tanh(\ell_i/2)}{1 - \prod_{i=1}^{\ell} \tanh(\ell_i/2)}, \tag{3.2} $$




where $\ell_i = \ln L(x_i \mid y_i)$. The belief propagation algorithm for LDPC codes can be 
derived from these two observations. In round 0, the message nodes send along all 
the outgoing edges their log-likelihoods conditioned on their observed value. For 
example, if the channel used is the BSC with error probability $p$, then the first 
message sent to all the check nodes adjacent to a message node is $\ln(1 - p) - \ln p$ 
if the node's value is zero, and it is the negative of this value if the node's value 
is one. In all the subsequent rounds of the algorithm a check node $c$ sends to an 
adjacent message node $v$ a log-likelihood according to (3.2). A message node $v$ sends 
to the check node $c$ its log-likelihood conditioned on its observed value and on 
the incoming log-likelihoods from adjacent check nodes other than $c$ using the 
relation (3.1). 
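
As a sanity check of (3.2), the following Python sketch compares the tanh formula with a direct summation over all bit patterns, assuming (as in the derivation) that the bits are independent given the observations. The function names and the sample log-likelihoods are illustrative choices, not part of the original text.

```python
import itertools
import math


def xor_llr(llrs):
    """Log-likelihood of x_1 xor ... xor x_l computed with formula (3.2)."""
    prod = 1.0
    for l in llrs:
        prod *= math.tanh(l / 2.0)
    return math.log((1.0 + prod) / (1.0 - prod))


def xor_llr_bruteforce(llrs):
    """Same quantity, obtained by summing the probabilities of all even-weight patterns."""
    p0 = [math.exp(l) / (1.0 + math.exp(l)) for l in llrs]  # Pr[x_i = 0 | y_i] = L/(1+L)
    even = 0.0
    for bits in itertools.product((0, 1), repeat=len(llrs)):
        pr = 1.0
        for b, q in zip(bits, p0):
            pr *= q if b == 0 else 1.0 - q
        if sum(bits) % 2 == 0:
            even += pr
    return math.log(even / (1.0 - even))


llrs = [1.2, -0.7, 2.5]          # arbitrary sample values of ln L(x_i | y_i)
via_tanh = xor_llr(llrs)
via_sum = xor_llr_bruteforce(llrs)
print(via_tanh, via_sum)

# Round-0 message on the BSC with error probability p, for a received zero.
p = 0.1
bsc_llr_zero = math.log(1 - p) - math.log(p)
```

The two values agree up to floating-point error, which is the content of the "(Why?)" step applied inductively.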

Let $m_{vc}^{(\ell)}$ be the message passed from message node $v$ to check node $c$ at the $\ell$th 
round of the algorithm. Similarly, define $m_{cv}^{(\ell)}$. At round 0, $m_{vc}^{(0)}$ is the log-likelihood 
of the message node $v$ conditioned on its observed value, which is independent of 
$c$. We denote this value by $m_v$. Then the update equations for the messages under 
belief propagation can be described as 

$$ m_{vc}^{(\ell)} = \begin{cases} m_v & \text{if } \ell = 0, \\ m_v + \sum_{c' \in C_v \setminus \{c\}} m_{c'v}^{(\ell-1)} & \text{if } \ell \ge 1, \end{cases} \tag{3.3} $$

$$ m_{cv}^{(\ell)} = \ln \frac{1 + \prod_{v' \in V_c \setminus \{v\}} \tanh\bigl(m_{v'c}^{(\ell)}/2\bigr)}{1 - \prod_{v' \in V_c \setminus \{v\}} \tanh\bigl(m_{v'c}^{(\ell)}/2\bigr)}, \tag{3.4} $$

where $C_v$ is the set of check nodes incident to message node $v$, and $V_c$ is the set 
of message nodes incident to check node $c$. 
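
A minimal Python sketch of the update equations (3.3) and (3.4) follows. It is not an optimized decoder: the message-node update rescans all edges, and the product is clamped to keep the logarithm finite (both are choices of this sketch). The example graph is a length-3 repetition code with checks $x_1 + x_2 = 0$ and $x_2 + x_3 = 0$, chosen because it is a tree, so the computed log-likelihoods are exact.

```python
import math


def bp_decode(checks, channel_llr, rounds=10):
    """Belief propagation with log-likelihood messages.

    checks:      list of lists of 0-based message-node indices, one list per check node
    channel_llr: the values m_v (log-likelihoods given the observed values)
    Returns (hard decisions, final log-likelihoods).
    """
    n = len(channel_llr)
    edges = [(c, v) for c, nbrs in enumerate(checks) for v in nbrs]
    m_vc = {(c, v): channel_llr[v] for (c, v) in edges}  # round 0 of (3.3)
    m_cv = {e: 0.0 for e in edges}
    for _ in range(rounds):
        # Check-node update (3.4): leave out the message coming from v itself.
        for (c, v) in edges:
            prod = 1.0
            for u in checks[c]:
                if u != v:
                    prod *= math.tanh(m_vc[(c, u)] / 2.0)
            prod = max(min(prod, 1 - 1e-12), -1 + 1e-12)  # keep the log finite
            m_cv[(c, v)] = math.log((1 + prod) / (1 - prod))
        # Message-node update (3.3): leave out the message coming from c itself.
        for (c, v) in edges:
            m_vc[(c, v)] = channel_llr[v] + sum(
                m_cv[(c2, v2)] for (c2, v2) in edges if v2 == v and c2 != c)
    posterior = [channel_llr[v] + sum(m_cv[(c, v2)] for (c, v2) in edges if v2 == v)
                 for v in range(n)]
    return [0 if l >= 0 else 1 for l in posterior], posterior


# BSC with error probability 0.1; the word 000 was sent and 010 was received.
a = math.log(0.9) - math.log(0.1)
decoded, posterior = bp_decode([[0, 1], [1, 2]], [a, -a, a])
print(decoded, posterior)
```

On this tree every final log-likelihood equals the sum of the three channel values, so the single error is corrected.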

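The computations at the check nodes can be simplified further by performing 
them in the log-domain. Since the value of $\tanh(x)$ can be negative, we need to 
keep track of its sign separately. Let $\gamma$ be a map from the real numbers $[-\infty, \infty]$ to 
$\mathbb{F}_2 \times [0, \infty]$ defined by $\gamma(x) := (\operatorname{sgn}(x), -\ln \tanh(|x|/2))$ (we set $\operatorname{sgn}(x) = 1$ if $x < 0$ 
and $\operatorname{sgn}(x) = 0$ otherwise). It is clear that $\gamma$ is bijective, so there exists an inverse 
function $\gamma^{-1}$. Moreover, $\gamma$ turns the products in (3.4) into sums: if 
$\tanh(z/2) = \tanh(x/2)\tanh(y/2)$, then $\gamma(z) = \gamma(x) + \gamma(y)$, where addition is component-wise 
in $\mathbb{F}_2$ and in $[0, \infty]$. Then it is very easy to show that (3.4) is equivalent to 

$$ m_{cv}^{(\ell)} = \gamma^{-1}\Bigl( \sum_{v' \in V_c \setminus \{v\}} \gamma\bigl(m_{v'c}^{(\ell)}\bigr) \Bigr). \tag{3.5} $$

We will use this representation when discussing density evolution later. 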
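
The equivalence of the two forms of the check-node update can be checked numerically. The Python sketch below assumes the sign bit of $\gamma$ is 1 exactly for negative inputs, so that adding sign bits in $\mathbb{F}_2$ tracks the sign of the product; the function names and sample messages are illustrative.

```python
import math


def gamma(x):
    """gamma(x) = (sign bit, -ln tanh(|x|/2)); x must be nonzero in this sketch."""
    return (1 if x < 0 else 0, -math.log(math.tanh(abs(x) / 2.0)))


def gamma_inv(s, r):
    """Inverse of gamma for r > 0: magnitude 2*atanh(exp(-r)), sign given by s."""
    mag = 2.0 * math.atanh(math.exp(-r))
    return -mag if s == 1 else mag


def check_update_tanh(msgs):
    """Check-node update written with the product of tanh terms, as in (3.4)."""
    prod = 1.0
    for m in msgs:
        prod *= math.tanh(m / 2.0)
    return math.log((1.0 + prod) / (1.0 - prod))


def check_update_log(msgs):
    """Same update in the log-domain: add gamma values component-wise, then invert."""
    s, r = 0, 0.0
    for m in msgs:
        sb, rb = gamma(m)
        s ^= sb      # addition in F_2
        r += rb      # addition in [0, inf]
    return gamma_inv(s, r)


msgs = [1.2, -0.7, 2.5]
out_tanh = check_update_tanh(msgs)
out_log = check_update_log(msgs)
print(out_tanh, out_log)
```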

In practice, belief propagation may be executed for a maximum number of 
rounds or until the passed likelihoods are close to certainty, whichever is first. A 
certain likelihood is a likelihood in which $\ln L(x \mid y)$ is either $\infty$ or $-\infty$. If it is $\infty$, 
then $\Pr[x = 0 \mid y] = 1$, and if it is $-\infty$, then $\Pr[x = 1 \mid y] = 1$. 

One very important aspect of belief propagation is its running time. Since the 
algorithm traverses the edges in the graph, and the graph is sparse, the number of 
edges traversed is small. Moreover, if the algorithm runs for a constant number of 
rounds, then each edge is traversed a constant number of times, and the algorithm 
uses a number of operations that is linear in the number of message nodes! 

Another important note about belief propagation is that the algorithm itself 
is entirely independent of the channel used, though the messages passed during 
the algorithm are completely dependent on the channel. 

One question that might arise is about the relationship of belief propagation 
and maximum likelihood decoding. The answer is that belief propagation is in 
general less powerful than maximum likelihood decoding. In fact, it is easy to construct 
classes of LDPC codes for which maximum likelihood decoding can decode 
many more errors than belief propagation. (One example is given by biregular bipartite 
graphs in which the common degree of the message nodes is very large, but 
the reader is not required to see this right away.) 



4. Asymptotic Analysis of Belief Propagation 
and Density Evolution 

The messages passed at each round of the belief propagation algorithm are random 
variables. If at every round in the algorithm the incoming messages are statistically 
independent, then the update equation correctly calculates the corresponding 
log-likelihood based on the observations. (This is what I meant by the independence 
assumption above.) This assumption is rather questionable, though, especially 
when the number of iterations is large. In fact, the independence assumption is 
correct for the first $\ell$ rounds of the algorithm only if the neighborhood of a message 
node up to depth $\ell$ is a tree. 

Nevertheless, belief propagation can be analyzed using a combination of tools 
from combinatorics and probability theory. The first analysis for a special type of 
belief propagation appeared in [16], and was applied to hard decision decoding of 
LDPC codes in [18]. The analysis was vastly generalized in [31] to belief propaga- 
tion over a large class of channels. 

The analysis starts by proving that if $\ell$ is fixed and $n$ and $r$ are large enough, 
then for random bipartite graphs the neighborhood of depth $\ell$ of most of the 
message nodes is a tree. Therefore, for $\ell$ rounds the belief propagation algorithm 
on these nodes correctly computes the likelihood of the node. Let us call these 
nodes the good nodes. We will worry about the other nodes later. 

Next the expected behavior of belief propagation is calculated by analyzing 
the algorithm on the tree, and a martingale is used to show that the actual behavior 
of the algorithm is sharply concentrated around its expectation. This step of the 
analysis is rather standard, at least in Theoretical Computer Science. 

Altogether, the martingale arguments and the tree assumption (which holds 
for large graphs and a fixed iteration number $\ell$) prove that a heuristic analysis of 
belief propagation on trees correctly mirrors the actual behavior on the full graph 
for a fixed number of iterations. The probability of error among the good message 
nodes in the graph can be calculated according to the behavior of belief propaga- 
tion. For appropriate degree distributions this shows that the error probability of 
the good message nodes in the graph can be made arbitrarily small. What about 
the other (non-good) message nodes? Since their fraction is smaller than a con- 
stant, they will contribute only a sub-constant term to the error probability and 
their effect will disappear asymptotically, which means that they are not relevant 
for an asymptotic analysis. Details can be found in the above mentioned literature. 

The analysis of the expected behavior of belief propagation on trees leads to 
a recursion for the density function of the messages passed along the edges. The 
general machinery shows that, asymptotically, the actual density of the messages 
passed is very close to the expected density. Tracking the expected density during 
the iterations thus gives a very good picture of the actual behavior of the algorithm. 
This method, called density evolution [31, 29, 18], is one of the crown jewels of the 
asymptotic theory of LDPC codes. In the following, I will briefly discuss this. 

As a first remark note that if $X_1, \ldots, X_d$ are i.i.d. random variables over some 
(additive) group $G$, and if $f$ is the common density of the $X_i$, then the density $F$ 
of $X_1 + \cdots + X_d$ equals the $d$-fold convolutional power of $f$. (For any two integrable 
functions $f$ and $g$ defined over $G$ the convolution of $f$ and $g$, denoted $f \otimes g$, is 
defined as $(f \otimes g)(\tau) = \int_G f(\sigma)\, g(\tau - \sigma)\, dG$, where $dG$ is the Haar measure on $G$.) 
If $G$ is the group of real numbers with respect to addition, then $f \otimes g$ is the 
well known convolution of real functions. 







Let now $g_i$ denote the common density function of the messages $m_{cv}$ sent 
from check nodes to message nodes at round $i$ of the algorithm, and let $f$ denote 
the density of the messages $m_v$, i.e., the log-likelihoods sent at round 
0 of the algorithm. Then the update rule for the densities in (3.3) implies that 
the common density $f_{i+1}$ of the messages sent from message nodes to check nodes 
at round $i + 1$, conditioned on the event that the degree of the node is $d$, equals 
$f \otimes g_i^{\otimes(d-1)}$. 

Next we assume that the graph is random such that each edge is connected 
to a message node of degree $d$ with probability $\lambda_d$, and each edge is connected to 
a check node of degree $d$ with probability $\rho_d$. Then the expected density of the 
messages sent from message nodes to check nodes at round $i + 1$ is $f \otimes \lambda(g_i)$, where 
$\lambda(g_i) = \sum_d \lambda_d\, g_i^{\otimes(d-1)}$. (All this of course assumes the independence assumption.) 

To assess the evolution of the densities at the check nodes, we need to use the 
operator $\gamma$ introduced above. For a random variable $X$ on $[-\infty, \infty]$ with density 
$F$ let $\Gamma(F)$ denote the density of the random variable $\gamma(X)$. $\gamma(X)$ is defined on the 
group $G := \mathbb{F}_2 \times [0, \infty]$. Therefore, the density of $\gamma(X) + \gamma(Y)$ is the convolution 
(over $G$) of $\Gamma(F)$ and $\Gamma(H)$, where $H$ denotes the density of $Y$. Following (3.5) 
and using the independence assumption, we see that the 
common density $g_i$ of the messages passed from check to message nodes at round 
$i$ is $\Gamma^{-1}(\rho(\Gamma(f_i)))$, where $\rho(h) = \sum_d \rho_d\, h^{\otimes(d-1)}$. All in all, we obtain the following 
recursion for the densities $f_i$: 

$$ f_{i+1} = f \otimes \lambda\bigl(\Gamma^{-1}(\rho(\Gamma(f_i)))\bigr). \tag{4.1} $$

This recursion is called density evolution. The reason for the naming should be 
obvious. 

I have not made the recursion very explicit. In fact, the operator $\Gamma$ has not 
been derived at all. For that I refer the reader to [31] and [29]. 

Density evolution can be used in conjunction with Fourier Transform tech- 
niques to obtain asymptotic thresholds below which belief propagation decodes the 
code successfully, and above which belief propagation does not decode successfully 
([31, 29]). 

Density evolution is exact only as long as the incoming messages are independent 
random variables. For a finite graph this can be the case only for a small number 
of rounds. 



5. Decoding on the BEC 

Perhaps the most illustrative example of belief propagation is when it is applied 
to LDPC codes over the BEC with erasure probability p. In fact, almost all the 
important and interesting features of the belief propagation algorithm are already 
present on the BEC. A thorough analysis of this special case seems thus to be a 
prerequisite for the general case. 







It is sufficient to assume that the all-zero codeword was sent. The log-likelihood 
of the messages at round 0, $m_v$, is $+\infty$ if the corresponding message bit is not 
erased, and it is 0 if the message bit is erased. Moreover, consulting the update 
equations for the messages, we see that if $v$ is not erased, then the message passed 
from $v$ to any of its incident check nodes is always $+\infty$. 

The update equations also imply that $m_{cv}$ is $+\infty$ if and only if all the message 
nodes incident to $c$ except $v$ are not erased. In all other cases $m_{cv}$ is zero. 

If $v$ is an erased message node, then $m_v = 0$. The message $m_{vc}$ is $+\infty$ if and 
only if there is some check node incident to $v$ other than $c$ which was sending a 
message $+\infty$ to $v$ in the previous round. 

Because of the binary feature of the messages, belief propagation on the 
erasure channel can be described much more simply as follows: 

1. [Initialization] 

Initialize the values of all the check nodes to zero. 

2. [Direct recovery] 

For all message nodes v, if the node is received, then add its value to the values 
of all adjacent check nodes and remove v together with all edges emanating 
from it from the graph. 

3. [Substitution recovery] 

If there is a check node $c$ of degree one, substitute its value into the value 
of its unique neighbor among the message nodes, add that value into the 
values of all adjacent check nodes and remove that message node and all 
edges emanating from it from the graph. 

This algorithm was first proposed in [17] though connections to belief propagation 
were not realized then. It is clear that the number of operations that this algorithm 
performs is proportional to the number of edges in the graph. Hence, for sparse 
graphs the algorithm runs in time linear in the block length of the code. However, 
there is no guarantee that the algorithm can decode all message nodes. Whether 
or not this is the case depends on the graph structure. 
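
The three steps above can be sketched in Python as follows. The sketch repeats the substitution step until no check node of degree one remains, which is how the step is meant to be applied. The example graph is the parity check structure of a (7,4) Hamming code, chosen here only for illustration; it is not taken from the text.

```python
def peel_decode(checks, received):
    """Erasure decoding by direct and substitution recovery.

    checks:   list of lists of 0-based message-node indices, one list per check node
    received: list with 0/1 for received bits and None for erased bits
    Returns the word with recovered bits filled in (None where decoding got stuck).
    """
    word = list(received)
    # Steps 1 and 2: check values start at zero and absorb every received bit;
    # only the erased neighbours stay attached to each check node.
    value = [0] * len(checks)
    pending = [set() for _ in checks]
    for c, nbrs in enumerate(checks):
        for v in nbrs:
            if word[v] is None:
                pending[c].add(v)
            else:
                value[c] ^= word[v]
    # Step 3: as long as some check node has degree one, recover its neighbour.
    progress = True
    while progress:
        progress = False
        for c in range(len(checks)):
            if len(pending[c]) == 1:
                v = next(iter(pending[c]))
                word[v] = value[c]
                for c2 in range(len(checks)):
                    if v in pending[c2]:
                        pending[c2].discard(v)
                        value[c2] ^= word[v]
                progress = True
    return word


# Illustrative graph: checks of a (7,4) Hamming code, codeword 1011001.
HAMMING = [[0, 1, 2, 4], [1, 2, 3, 5], [0, 2, 3, 6]]
recovered = peel_decode(HAMMING, [None, 0, 1, 1, 0, None, 1])
stuck = peel_decode(HAMMING, [None, None, None, 1, 0, 0, 1])
print(recovered, stuck)
```

The second call illustrates the caveat in the text: with the first three positions erased, no check node ever has degree one, so the decoder stops without recovering anything.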

The decoding algorithm can be analyzed along the same lines as the full belief 
propagation. First, we need to find the expected density of the messages passed 
at each round of the algorithm under the independence assumption. In this case, 
the messages are binary (either $+\infty$ or 0), hence we only need to keep track of one 
parameter, namely the probability $p_i$ that the message passed from message nodes 
to check nodes at round $i$ of the algorithm is 0. Let $q_i$ denote the probability that 
the message passed from check nodes to message nodes at round $i$ of the algorithm 
is 0. Then, conditioned on the event that the message node is of degree $d$, we have 
$p_{i+1} = p \cdot q_i^{d-1}$. Indeed, a message from a message node $v$ to a check node $c$ is 0 iff 
$v$ was erased and all the messages coming from the neighboring check nodes other 
than $c$ are 0; the latter event has probability $q_i^{d-1}$ under the independence assumption. 
Conditioned on the event that the check node has degree $d$ we have 
$q_i = 1 - (1 - p_i)^{d-1}$: the check 
node $c$ sends a message $+\infty$ to the message node $v$ iff all the neighboring message 
nodes except for $v$ send a message $+\infty$ to $c$ in the previous round. Under the 
independence assumption that probability is $(1 - p_i)^{d-1}$, which shows the identity. 

These recursions are not in a usable form yet since they are conditioned on the 
degrees of the message and the check nodes. To obtain a closed form we use again 
the numbers $\lambda_d$ and $\rho_d$ defined above. Recall that $\lambda_d$ is the probability that an edge 
is connected to a message node of degree $d$, and $\rho_d$ denotes the probability that an 
edge is connected to a check node of degree $d$. Defining the generating functions 
$\lambda(x) = \sum_d \lambda_d x^{d-1}$ and $\rho(x) = \sum_d \rho_d x^{d-1}$ we obtain the following recursion using 
the formula for the total probability: 

$$ p_{i+1} = p \cdot \lambda(1 - \rho(1 - p_i)). $$

Under the independence assumption, and assuming that the underlying graph 
is random with edge degree distributions given by $\lambda(x)$ and $\rho(x)$, decoding is 
successful if $p_{i+1} < (1 - \epsilon)\, p_i$ for all $i$ and some $0 < \epsilon < 1$. This yields the condition 

$$ p \cdot \lambda(1 - \rho(1 - x)) < x \quad \text{for } x \in (0, p) \tag{5.1} $$

for successful decoding which was first proved in [19] and later reproduced in [17]. 
It is a useful and interesting exercise for the reader to show that (5.1) is identical 
to (4.1) in the case of the BEC. 
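
The one-parameter recursion is easy to iterate numerically. The Python sketch below does so for a (3, 6)-biregular degree distribution, that is $\lambda(x) = x^2$ and $\rho(x) = x^5$; the helper names, the erasure probabilities 0.40 and 0.45, and the number of rounds are illustrative choices of this sketch.

```python
def edge_poly(coeffs):
    """Turn {d: weight of degree d} into the function x -> sum_d weight_d * x^(d-1)."""
    return lambda x: sum(w * x ** (d - 1) for d, w in coeffs.items())


def bec_density_evolution(p, lam, rho, rounds=2000):
    """Iterate p_{i+1} = p * lam(1 - rho(1 - p_i)), starting from p_0 = p."""
    x = p
    for _ in range(rounds):
        x = p * lam(1.0 - rho(1.0 - x))
    return x


lam = edge_poly({3: 1.0})   # every message node has degree 3
rho = edge_poly({6: 1.0})   # every check node has degree 6

below = bec_density_evolution(0.40, lam, rho)   # erasure fraction is driven to zero
above = bec_density_evolution(0.45, lam, rho)   # recursion stalls at a positive fixed point
print(below, above)
```

For $p = 0.40$ the iterates tend to zero, while for $p = 0.45$ they settle at a positive fixed point, so condition (5.1) fails somewhere on $(0, p)$.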

Condition (5.1) was proved in [19] in a completely different way than ex- 
plained here. A system of differential equations was derived whose solutions tracked 
the expected fraction of nodes of various degrees as the decoding process evolved. 
One of the solutions corresponds to the fraction of check nodes of reduced de- 
gree one during the algorithm. By keeping this fraction above zero at all times, 
it is guaranteed that in expectation there are always check nodes of degree one 
left to continue the decoding process. To show that the actual values of the random 
variables are sharply concentrated around their computed expectations, a 
large deviation result was derived which is not dissimilar to Azuma's inequality for 
martingales. 

Condition (5.1) can be used to calculate the maximal fraction of erasures a 
random LDPC code with given edge degree distributions can correct using the 
simple decoding algorithm. For example, consider a random biregular graph in 
which each message node has degree 3 and each check node has degree 6. (Such 
a graph is called a (3, 6)-biregular graph.) In this case $\lambda(x) = x^2$ and $\rho(x) = x^5$. 
What is the maximum fraction of erasures $p$? (In fact, this value is a supremum.) 
You can simulate the decoder on many such random graphs with a large number of 
message nodes. The simulations will show that on average around 42.9% erasures 
can be recovered. What is this value? According to (5.1) it is the supremum of 
all $p$ such that $p\,(1 - (1 - x)^5)^2 < x$ on $(0, p)$. The minimum of the function 
$x/(1 - (1 - x)^5)^2$ on $(0, 1)$ is attained at the unique root of the polynomial 
$9x^4 - 35x^3 + 50x^2 - 30x + 5$ in the interval $(0, 1)$, and this minimum is the supremum value for 
$p$. This value can be computed exactly, using formulas for the solution of the 
quartic [2]. 
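
Instead of the closed-form solution of the quartic, the root can be located numerically. The Python sketch below uses plain bisection on the polynomial (which is positive at 0 and negative at 1) and then evaluates the function $x/(1 - (1 - x)^5)^2$ at the root; the function names are illustrative.

```python
def ratio(x):
    """x / (1 - (1 - x)^5)^2, whose minimum over (0, 1) is the erasure threshold."""
    return x / (1.0 - (1.0 - x) ** 5) ** 2


def quartic(x):
    """The polynomial whose root in (0, 1) locates the minimum of ratio."""
    return 9 * x**4 - 35 * x**3 + 50 * x**2 - 30 * x + 5


def bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return (lo + hi) / 2.0


x_star = bisect(quartic, 0.0, 1.0)
threshold = ratio(x_star)
print(x_star, threshold)
```

The root lies between 0.260 and 0.261, and the resulting threshold agrees with the simulated value of about 42.9%.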







As a side remark, I would like to mention an interesting result. First, it is not 
hard to see that the ratio $r/n$ between the number of check nodes and the number of 
message nodes equals $\int_0^1 \rho(x)\,dx \big/ \int_0^1 \lambda(x)\,dx$. The rate of the code is at least 
$1 - r/n$, and since the 
capacity of the erasure channel with erasure probability $p$ is $1 - p$, (5.1) should 
imply that $p \le \int_0^1 \rho(x)\,dx \big/ \int_0^1 \lambda(x)\,dx$ in a purely mechanical way (without using 
the interpretations above). Can you see how to derive this? (See also [34].) 



6. Hard Decision Decoding on the BSC 

The belief propagation algorithm is the best algorithm among message passing 
decoders, and the accompanying density evolution provides a tool for analyzing the 
algorithm. However, for practical applications on channels other than the BEC the 
belief propagation algorithm is rather complicated, and often leads to a decrease in 
the speed of the decoder. Therefore, a discretized version of the belief 
propagation algorithm is often used. The lowest level of discretization is achieved when 
the messages passed are binary. In this case one often speaks of a hard decision 
decoder, as opposed to a soft decision decoder which uses a larger range of values. 
In this section I will describe two hard decision decoding algorithms on the BSC, 
both due to Gallager [13]. 

In both cases the messages passed between the message nodes and the check 
nodes consist of 0 and 1. Let me first describe the Gallager A algorithm: in round 0, 
the message nodes send their received values to all their neighboring check nodes. 
From that point on at each round a check node c sends to the neighboring message 
node v the addition (mod 2) of all the incoming messages from incident message 
nodes other than v. A message node v sends the following message to the check 
node $c$: if all the incoming messages from check nodes other than $c$ are the same 
value $b$, then $v$ sends the value $b$ to $c$; otherwise it sends its received value to $c$. 

An exact analysis of this algorithm was first given in [18]. The analysis is 
similar to the case of the BEC. We first find the expected density of the messages 
passed at each round. Again, we can assume that the all-zero word was transmitted 
over the BSC with error probability $p$. Since the messages are 0 and 1, we only 
need to track $p_i$, the probability that the message sent from a message node to a 
check node at round $i$ is 1. Let $q_i$ denote the probability that the message sent from 
a check node to a message node at round $i$ is 1. Conditioned on the event that the 
message node is of degree $d$, and under the independence assumption, we obtain 
$p_{i+1} = (1 - p)\, q_i^{d-1} + p \cdot (1 - (1 - q_i)^{d-1})$. To see this, observe that the message 1 is 
passed from message node $v$ to check node $c$ iff one of these two cases occurs: (a) 
the message node was received in error (with probability $p$) and at least one of the 
incoming messages is a 1 (with probability $1 - (1 - q_i)^{d-1}$), or (b) the message was 
received correctly (probability $1 - p$) and all incoming messages are 1 (probability 
$q_i^{d-1}$). To assess the evolution of $q_i$ in terms of $p_i$, note that a check node $c$ sends 
a message 1 to message node $v$ at round $i$ iff the addition mod 2 of the incoming 
messages from message nodes other than $v$ in the previous round is 1. Each such 
message is 1 with probability $p_i$, and the messages are independent. Conditioned 
on the event that the check node is of degree $\ell$, there are $\ell - 1$ such messages. The 
probability that their addition mod 2 is 1 is $q_i = (1 - (1 - 2p_i)^{\ell-1})/2$ (why?). 

These recursions are for the conditional probabilities, conditioned on the 
degrees of the nodes. Introducing the generating functions $\lambda(x)$ and $\rho(x)$ as above, 
we obtain the following recursion for the probabilities themselves: 

$$ p_{i+1} = (1 - p) \cdot \lambda\Bigl(\frac{1 - \rho(1 - 2p_i)}{2}\Bigr) + p \cdot \Bigl(1 - \lambda\Bigl(\frac{1 + \rho(1 - 2p_i)}{2}\Bigr)\Bigr). \tag{6.1} $$

If $\lambda(x)$, $\rho(x)$, and $p$ are such that $p_i$ is monotonically decreasing, then decoding 
will be successful asymptotically with high probability, as long as the independence 
assumption is valid. 

For example, consider a (3, 6)-biregular graph. In this case $\lambda(x) = x^2$ and 
$\rho(x) = x^5$, and the condition becomes 

$$ (1 - p) \cdot \Bigl(\frac{1 - (1 - 2x)^5}{2}\Bigr)^2 + p \cdot \Bigl(1 - \Bigl(\frac{1 + (1 - 2x)^5}{2}\Bigr)^2\Bigr) < x $$

for $x \in (0, p)$. A numerical calculation shows that the best value for $p$ is around 
0.039. 
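
The numerical calculation can be reproduced with a few lines of Python. The sketch below iterates the recursion for the (3, 6)-biregular case starting from $p_0 = p$ and bisects on $p$. The stopping tolerance, the iteration cap, and the assumption that convergence is monotone in $p$ are choices of this sketch; because of the finite cap the result slightly underestimates the true threshold.

```python
def gallager_a_step(x, p, dv=3, dc=6):
    """One step of the Gallager A recursion for lambda(x) = x^(dv-1), rho(x) = x^(dc-1)."""
    r = (1.0 - 2.0 * x) ** (dc - 1)
    q = (1.0 - r) / 2.0            # probability that a check-to-message bit is wrong
    return (1.0 - p) * q ** (dv - 1) + p * (1.0 - (1.0 - q) ** (dv - 1))


def converges(p, rounds=5000, tol=1e-12):
    """True if the error probability drops below tol within the given number of rounds."""
    x = p
    for _ in range(rounds):
        x = gallager_a_step(x, p)
        if x < tol:
            return True
    return False


def gallager_a_threshold(lo=0.0, hi=0.1, iters=40):
    """Bisection on p, assuming convergence for p implies convergence for smaller p."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if converges(mid):
            lo = mid
        else:
            hi = mid
    return lo


p_star = gallager_a_threshold()
print(p_star)
```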

By Shannon's theorem the maximum error probability that a code of rate 
1/2 can correct is the maximum $p$ such that $1 + p \log_2(p) + (1 - p) \log_2(1 - p) = 
0.5$. A numerical approximation shows that $p$ is around 11%, which means that 
the Gallager A algorithm on the biregular (3, 6)-graph is very far from achieving 
capacity. Bazzi et al. [2] show that for rate 1/2 the best graph for the Gallager 
A algorithm is the biregular (4, 8)-graph for which the maximum tolerable error 
probability is roughly 0.0475 - still very far from capacity. This shows that this 
algorithm, though simple, is very far from using all the information that can be 
used. 
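
The 11% figure is the solution of the capacity equation of the BSC for rate 1/2. A short Python sketch using bisection follows; the function names are illustrative, and the search is restricted to $[0, 1/2]$, where the capacity decreases from 1 to 0.

```python
import math


def bsc_capacity(p):
    """Capacity of the BSC: 1 + p*log2(p) + (1-p)*log2(1-p)."""
    if p in (0.0, 1.0):
        return 1.0
    return 1.0 + p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p)


def shannon_limit(rate, iters=100):
    """Largest p in [0, 1/2] whose BSC capacity is still at least the given rate."""
    lo, hi = 0.0, 0.5
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if bsc_capacity(mid) > rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


p_limit = shannon_limit(0.5)
print(p_limit)
```

Comparing this value with the thresholds 0.039 and 0.0475 quoted above makes the gap to capacity explicit.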



Gallager's algorithm B is slightly more powerful than algorithm A. In this 
algorithm, for each degree $j$ and each round $i$ there is a threshold value $b_{i,j}$ (to 
be determined) such that at round $i$ for each message node $v$ and each adjacent 
check node $c$, if at least $b_{i,j}$ neighbors of $v$ excluding $c$ sent the same information 
in the previous round, then $v$ sends that information to $c$; otherwise $v$ sends its 
received value to $c$. The rest of the algorithm is the same as in algorithm A. 

It is clear that algorithm A is a special case of algorithm B, in which $b_{i,j} = 
j - 1$ independent of the round. 

This algorithm can be analyzed in the same manner as algorithm A, and a 
recursion can be obtained for the probability $p_i$ that a message node is sending 
the incorrect information to a check node at round $i$: 

$$ p_{i+1} = p - \sum_{j \ge 1} \lambda_j \Biggl[ p \sum_{t=b_{i,j}}^{j-1} \binom{j-1}{t} \Bigl[\frac{1 + \rho(1 - 2p_i)}{2}\Bigr]^{t} \Bigl[\frac{1 - \rho(1 - 2p_i)}{2}\Bigr]^{j-1-t} - (1 - p) \sum_{t=b_{i,j}}^{j-1} \binom{j-1}{t} \Bigl[\frac{1 - \rho(1 - 2p_i)}{2}\Bigr]^{t} \Bigl[\frac{1 + \rho(1 - 2p_i)}{2}\Bigr]^{j-1-t} \Biggr], $$

where the value of $b_{i,j}$ is the smallest integer that satisfies 

$$ \frac{1 - p}{p} \le \Biggl[\frac{1 + \rho(1 - 2p_i)}{1 - \rho(1 - 2p_i)}\Biggr]^{2b_{i,j} - j + 1}. $$

(See [18] for details of the analysis.) 
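
The following Python sketch evaluates one step of the algorithm B recursion in the form $p_{i+1} = p - \sum_j \lambda_j\,[\,p \cdot (\text{corrections}) - (1 - p) \cdot (\text{new errors})\,]$ and checks that the choice $b_{i,j} = j - 1$ reproduces the algorithm A recursion. The function names are illustrative, and the fallback to $j - 1$ in `smallest_b` when no smaller threshold qualifies is an assumption of this sketch.

```python
from math import comb


def gallager_b_step(x, p, lam, rho, b):
    """One step of the algorithm B recursion.

    lam: dict {j: lambda_j}; rho: callable; b: callable j -> threshold b_{i,j}.
    """
    r = rho(1.0 - 2.0 * x)
    good, bad = (1.0 + r) / 2.0, (1.0 - r) / 2.0   # check message correct / incorrect
    total = p
    for j, w in lam.items():
        ts = range(b(j), j)                          # t = b_{i,j}, ..., j-1
        fix = sum(comb(j - 1, t) * good**t * bad**(j - 1 - t) for t in ts)
        flip = sum(comb(j - 1, t) * bad**t * good**(j - 1 - t) for t in ts)
        total -= w * (p * fix - (1.0 - p) * flip)
    return total


def gallager_a_step(x, p, lam, rho):
    """Algorithm A recursion, for comparison."""
    q = (1.0 - rho(1.0 - 2.0 * x)) / 2.0
    lam_at = lambda y: sum(w * y ** (j - 1) for j, w in lam.items())
    return (1.0 - p) * lam_at(q) + p * (1.0 - lam_at(1.0 - q))


def smallest_b(x, p, j, rho):
    """Smallest integer b with (1-p)/p <= ((1+r)/(1-r))^(2b-j+1); requires 0 < r < 1."""
    r = rho(1.0 - 2.0 * x)
    for b in range(j):
        if (1.0 - p) / p <= ((1.0 + r) / (1.0 - r)) ** (2 * b - j + 1):
            return b
    return j - 1   # assumption of this sketch: fall back to algorithm A's threshold


lam = {3: 1.0}                     # (3, 6)-biregular graph
rho = lambda y: y ** 5
step_b = gallager_b_step(0.03, 0.03, lam, rho, lambda j: j - 1)
step_a = gallager_a_step(0.03, 0.03, lam, rho)
b_36 = smallest_b(0.03, 0.03, 3, rho)
print(step_a, step_b, b_36)
```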

For another hard decision decoder on the BSC (called “erasure decoder”), 
see [31]. 

The above one parameter recursions can be used to design codes that asymp- 
totically perform very well for a given amount of noise. The method of choice in 
these cases is linear programming. For details I refer the reader to [17, 18]. 



7. Completing the Analysis: Expander Based Arguments 

Density evolution and its instantiations are valid only as long as the incoming 
messages are independent. The messages are independent for $\ell$ rounds only if the 
neighborhoods of depth $\ell$ around the message nodes are trees. This immediately 
puts an upper bound on $\ell$ (of the order $\log(n)$, where $n$ is the number of message 
nodes, see Section 9). This number of rounds is usually not sufficient to prove 
that the decoding process corrects all errors. A different analysis is needed to 
complete the decoding. 

One property of the graphs that guarantees successful decoding is expansion. 
A bipartite graph with $n$ message nodes is called an $(\alpha, \beta)$-expander if for any 
subset $S$ of the message nodes of size at most $\alpha n$ the number of neighbors of $S$ 
is at least $\beta \cdot a_S \cdot |S|$, where $a_S$ is the average degree of the nodes in $S$. In other 
words, if there are many edges going out of a subset of message nodes, then there 
should be many neighbors. 
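
For very small graphs the expansion property can be tested by brute force. The Python sketch below takes the size bound $\lfloor \alpha n \rfloor$ directly as an integer argument to avoid floating-point rounding; the function name and the example graph (the checks of a (7,4) Hamming code) are illustrative and not taken from the text. The running time is exponential in the size bound.

```python
from itertools import combinations


def is_expander(checks, n, max_size, beta):
    """Brute-force test of the expansion property.

    checks:   list of lists of 0-based message-node indices, one list per check node
    max_size: largest subset size to test, i.e. floor(alpha * n)
    beta:     expansion factor
    """
    nbrs = [set() for _ in range(n)]
    for c, vs in enumerate(checks):
        for v in vs:
            nbrs[v].add(c)
    for size in range(1, max_size + 1):
        for S in combinations(range(n), size):
            edges = sum(len(nbrs[v]) for v in S)            # equals a_S * |S|
            neighbours = set().union(*(nbrs[v] for v in S))
            if len(neighbours) < beta * edges:
                return False
    return True


HAMMING = [[0, 1, 2, 4], [1, 2, 3, 5], [0, 2, 3, 6]]
expands_2 = is_expander(HAMMING, 7, 2, 0.5)   # all subsets of size at most 2
expands_3 = is_expander(HAMMING, 7, 3, 0.5)   # fails: nodes 0, 1, 2 have 7 edges, 3 neighbours
print(expands_2, expands_3)
```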

Expansion arguments have been used by many researchers in the study of 
decoding codes obtained from graphs [38, 37, 35, 36]. Later, [17, 18] used expander 
based arguments to show that the erasure correction algorithm on the BEC and 
Gallager’s hard decision decoding algorithm will decode all the erasures/errors if 
the fraction of errors is small and the graph has sufficient expansion. Burshtein 
and Miller [5] generalized these results to general message passing algorithms. 

To give the reader an idea of how these methods are used, I will exemplify
them in the case of the BEC. Choose a graph with edge degree distributions given
by $\lambda(x)$ and $\rho(x)$ at random. The analysis of the belief propagation decoder for
the BEC implies that if condition (5.1) is true, then for any $\varepsilon > 0$ there is an $n_0$
such that for all $n > n_0$ the erasure decoder reduces the number of erased message
nodes below $\varepsilon n$. The algorithm may well decode all the erasures, but the point is
that the analysis does not guarantee that.

To complete the analysis of the decoder, we first note the following fact: if
the random graph is an $(\varepsilon, 1/2)$-expander, then the erasure decoding algorithm
recovers any set of $\varepsilon n$ or fewer erasures. Suppose that this were not the case and
consider a minimal counterexample consisting of a nonempty set $S$ of erasures.
Consider the subgraph induced by $S$, and denote by $\Gamma(S)$ the set of neighbors of
$S$. No node in $\Gamma(S)$ has degree 1 in this subgraph, since this neighbor would recover
one element in $S$ and would contradict the minimality of $S$. Hence, the total number
of edges emanating from these nodes is at least $2|\Gamma(S)|$. On the other hand, the total
number of edges emanating from $S$ is $a_S \cdot |S|$, so $a_S \cdot |S| \ge 2|\Gamma(S)|$, which implies
$|\Gamma(S)| \le a_S \cdot |S|/2$ and contradicts the expansion property of the graph.
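
The decoder used in this argument can be sketched in a few lines. The sketch below only tracks which message nodes are still erased, not their values; the name `peel_erasures` and the toy graph are my own illustrative choices.

```python
def peel_erasures(checks, erased):
    """checks: list of sets of message-node indices, one set per check node.
    erased: set of erased message nodes.
    Returns the set of message nodes that are still erased when no check
    node with exactly one erased neighbor remains.
    """
    erased = set(erased)
    progress = True
    while progress and erased:
        progress = False
        for members in checks:
            unknown = members & erased
            if len(unknown) == 1:
                # all other neighbors are known, so this check node
                # determines the single missing value
                erased -= unknown
                progress = True
    return erased

# Hypothetical toy graph with 5 message nodes and 4 check nodes.
checks = [{0, 1, 2}, {1, 2, 3}, {2, 3, 4}, {0, 4}]
print(peel_erasures(checks, {1, 2}))     # everything is recovered
print(peel_erasures(checks, {1, 2, 3}))  # decoder is stuck on {1, 2, 3}
```

In the second call every check node sees either zero or at least two erased neighbors, which is exactly the situation excluded by the expansion argument for small sets.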

In [17] it is shown that for a random bipartite graph without message nodes
of degree one or two there is a constant $\varepsilon$ depending on the rate of the induced code
and on the degrees of the message nodes such that the graph is an $(\varepsilon, 1/2)$-expander
with high probability. On random graphs without message nodes of degrees one
or two we see that the erasure decoding algorithm succeeds with high probability
provided condition (5.1) is satisfied.



8. Achieving Capacity 

Recall Shannon’s theorem which states the existence of codes that come arbitrarily 
close to the capacity of the channel when decoded with maximum likelihood de- 
coding. LDPC codes were designed to have decoding algorithms of low complexity, 
such as belief propagation and its variants. But how close can we get to capacity 
using these algorithms? 

There is no satisfactory answer to this question for arbitrary channels. What 
I mean by a satisfactory answer is an answer to the question whether subclasses 
of LDPC codes, for example LDPC codes with an appropriate degree distribution, 
will provably come arbitrarily close to the capacity of the channel. Optimization 
results for various channels, such as the Additive White Gaussian Noise (AWGN) 
channel and the BSC have produced specific degree distributions such that the 
corresponding codes come very close to capacity, see [29, 8]. 

We call an LDPC code $\varepsilon$-close for a channel $C$ with respect to some message
passing algorithm if the rate of the code is at least $\mathrm{Cap}(C) - \varepsilon$ and if the message
passing algorithm can correct errors over that channel with high probability. We
call a sequence of degree distributions $(\lambda^{(n)}(x), \rho^{(n)}(x))$ capacity-achieving over
that channel with respect to the given algorithm if for any $\varepsilon$ there is some $n_0$
such that for all $n > n_0$ the LDPC code corresponding to the degree distribution
$(\lambda^{(n)}(x), \rho^{(n)}(x))$ is $\varepsilon$-close to capacity. Using this notation, the following question
is open:







Is there a nontrivial channel other than the BEC and a message passing algorithm
for which there exists a capacity-achieving sequence $(\lambda^{(n)}(x), \rho^{(n)}(x))$?



I believe that this question is one of the fundamental open questions in the 
asymptotic theory of LDPC codes. 

In [17] the authors describe capacity-achieving sequences for the BEC for any
erasure probability $p$. Let $\varepsilon > 0$ be given, let $D := \lceil 1/\varepsilon \rceil$, and set

$$\lambda(x) = \frac{1}{H(D)} \sum_{i=1}^{D} \frac{x^i}{i}, \qquad \rho(x) = e^{\alpha(x-1)},$$

where $\alpha = H(D)/p$. (Technically, $\rho(x)$ cannot define a degree distribution since it
is a power series and not a polynomial. But the series can be truncated to obtain
a function that is arbitrarily close to the exponential.) We now apply (5.1):

$$p\lambda(1 - \rho(1 - x)) < -\frac{p}{H(D)} \ln(\rho(1 - x)) = \frac{\alpha p}{H(D)}\, x = x.$$

This shows that a corresponding code can decode a p-fraction of erasures with 
high probability. 2 

What about the rate of these codes? Above, we mentioned that the rate
of the code given by the degree distributions $\lambda(x)$ and $\rho(x)$ is at least
$1 - \int_0^1 \rho(x)\,dx / \int_0^1 \lambda(x)\,dx$. In our case, this lower bound equals
$1 - p(1 + 1/D)(1 - e^{-\alpha})$, which is larger than $1 - p(1 + \varepsilon)$.
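
Both claims can be checked numerically for concrete parameters. The sketch below assumes that $H(D)$ is the harmonic sum $\sum_{i=1}^{D} 1/i$ (this is what makes $\lambda(1) = 1$) and tests the inequality only on a finite grid, so it is a sanity check rather than a proof; the function name `tornado_check` is my own.

```python
import math

def tornado_check(p, eps, samples=200):
    """Grid check of p*lambda(1 - rho(1 - x)) < x and the rate lower bound."""
    D = math.ceil(1 / eps)
    H = sum(1 / i for i in range(1, D + 1))   # assumed: harmonic sum H(D)
    alpha = H / p

    def lam(x):
        return sum(x**i / i for i in range(1, D + 1)) / H

    def rho(x):
        return math.exp(alpha * (x - 1))

    grid = [k / samples for k in range(1, samples + 1)]
    condition_ok = all(p * lam(1 - rho(1 - x)) < x for x in grid)
    rate_bound = 1 - p * (1 + 1 / D) * (1 - math.exp(-alpha))
    return condition_ok, rate_bound

ok, rate = tornado_check(p=0.5, eps=0.25)
print(ok, rate, 1 - 0.5 * (1 + 0.25))
```

For $p = 0.5$ and $\varepsilon = 0.25$ the rate bound lies strictly between $1 - p(1 + \varepsilon) = 0.375$ and the capacity $1 - p = 0.5$.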

The degree distribution above is called the Tornado degree distribution and 
the corresponding codes are called Tornado codes. These codes have many appli- 
cations in computer networking which I will not mention here (see, e.g., [6]). 

Tornado codes were the first class of codes that could provably achieve the 
capacity of the BEC using belief propagation. Since then many other distributions 
have been discovered [34, 27]. The latter paper also discusses general methodologies 
for constructing such degree distributions, and also discusses optimal convergence 
speeds to capacity. 



9. Graphs of Large Girth 

As is clear from the previous discussions, if the smallest cycle in the bipartite
graph underlying the LDPC code is of length $2\ell$, then the independence assumption is
valid for $\ell$ rounds of belief propagation. In particular, density evolution describes
the expected behavior of the density functions of these messages exactly for this
number of rounds.

2 Actually, as was discussed before, (5.1) only shows that the fraction of erasures can be reduced
to any constant fraction of the number of message nodes. To show that the decoding is successful
all the way to the end, we need a different type of argument. Expansion arguments do not work
for the corresponding graphs, since there are many message nodes of degree 2. For a way to
resolve these issues, see [17].




The girth of a graph is defined as the length of the smallest cycle in the graph.
For bipartite graphs the girth is necessarily even, so the smallest possible girth is
4. It is easy to obtain an upper bound for the girth of a biregular bipartite graph
with $n$ message nodes of degree $d$ and $r$ check nodes of degree $k$: if the girth is $2\ell$,
then the neighborhood of depth $\ell - 1$ of any message node is a tree with a root of
degree $d$, and in which all nodes of odd depth have $k - 1$ children, while all nodes of
even depth have $d - 1$ children (we assume that the root of the tree has depth 0).
The number of nodes at even depths in the tree should be at most equal to the number
of message nodes, while the number of nodes at odd depths in the tree should be at
most equal to the number of check nodes. The number of nodes at even depths in the tree
equals 1 for depth 0, $d(k-1)$ for depth 2, $d(k-1)D$ for depth 4, $d(k-1)D^2$ for
depth 6, etc., where $D = (d-1)(k-1)$. The total number of nodes at even depths
is equal to

$$1 + d(k-1)\,\frac{D^{\lfloor (\ell-1)/2 \rfloor} - 1}{D - 1}.$$

This number has to be less than or equal to $n$, the number of message nodes. This
yields an upper bound on $2\ell$, the girth of the graph. The bound has order $\log_D(n)$.
Similar bounds can be obtained by considering nodes of odd depths in the tree.

A bound similar to the one above can also be deduced for irregular graphs [1], but I
will not discuss it here.

As I said before, graphs of large girth are interesting because of the accuracy 
of belief propagation. However, this is not interesting for practical purposes, since 
for obtaining accuracy for many rounds the girth of the graph has to be large 
which means that the number of nodes in the graph has to be very large. 

There are other reasons to study graphs of large girth, however. From the 
point of view of combinatorics graphs of large girth which satisfy (or come close 
to) the upper bound on the girth are extremal objects. Therefore, to construct 
them, methods from extremal graph theory need to be applied. From the point of 
view of coding theory eliminating small cycles is very similar to eliminating words 
of small weight in the code. This is because a word of weight d leads to a cycle of 
length 2d or less. (Why?) 

How does one construct bipartite graphs of large girth? There are a number
of known techniques with origins in algebra and combinatorics. For example, it
is very easy to construct optimal bipartite graphs of girth 6. Below I will give
such a construction. Let $C$ be a Reed-Solomon code of dimension 2 and length
$n$ over the field $\mathbb{F}_q$. By definition, this code has $q^2$ codewords and the Hamming
distance between any two distinct codewords is at least $n - 1$. From $C$ we construct
a bipartite graph with $q^2$ message nodes and $nq$ check nodes in the following
way: The message nodes correspond to the codewords in $C$. The check nodes are
divided into $n$ groups of $q$ nodes each; the nodes in each such group correspond to the
elements of $\mathbb{F}_q$. The connections in the graph are obtained as follows: A message
node corresponding to the codeword $(x_1, \ldots, x_n)$ is connected to the check nodes
corresponding to $x_1$ in the first group, to $x_2$ in the second group, ..., to $x_n$ in
the last group. Hence, all message nodes have degree $n$, and all check nodes have
degree $q$, and the graph has in total $nq^2$ edges. Suppose that this graph has a
cycle of length 4. This means that there are two codewords (corresponding to the
two message nodes in the cycle) which coincide at two positions (corresponding to
the two check nodes in the cycle). This is impossible by the choice of the code $C$,
which shows that the girth of the graph is at least 6. To show the optimality of
these graphs, we compare the number of message nodes to the above bound. Let $2\ell$
denote the girth of the graph. If $\ell = 3$, then $1 + n(q - 1) \le q^2$, which shows that
$n \le q + 1$. By choosing $n = q + 1$ we obtain optimal graphs of girth 6.

If $n < q + 1$, the graphs obtained may not be optimal, and their girth may
be larger than 6. For $n > 2$ it is easy to see that the girth of the graph is indeed
6. For $n = 2$ the girth is 8. (A cycle of length 6 in the graph corresponds to three
codewords such that every two coincide in exactly one position. This is possible
for $n > 2$, and impossible for $n = 2$.)
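
The construction is easy to try out for small parameters. The sketch below makes two assumptions of my own: $q$ is prime, so that arithmetic modulo $q$ is field arithmetic, and the codeword of the polynomial $a + bx$ consists of its values at the points $0, \ldots, q-1$ followed by the coefficient $b$ as a coordinate "at infinity", which is one standard way to reach length $q + 1$. The girth is computed by breadth-first search from every node; the names `rs_graph` and `girth` are illustrative.

```python
from collections import deque
from itertools import product

def rs_graph(q, n):
    """Bipartite graph of the dimension-2 Reed-Solomon code of length n over F_q."""
    assert 1 <= n <= q + 1
    adj = {}
    for a, b in product(range(q), repeat=2):
        # values of a + b*x at x = 0..q-1, then b as the coordinate at infinity
        word = [(a + b * x) % q for x in range(q)] + [b]
        msg = ("m", a, b)
        for pos in range(n):
            chk = ("c", pos, word[pos])   # node word[pos] in group pos
            adj.setdefault(msg, set()).add(chk)
            adj.setdefault(chk, set()).add(msg)
    return adj

def girth(adj):
    """Length of the shortest cycle of a simple graph (inf if there is none)."""
    best = float("inf")
    for root in adj:
        dist = {root: 0}
        parent = {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:
                    # non-tree edge closes a cycle through the BFS tree
                    best = min(best, dist[u] + dist[v] + 1)
    return best

print(girth(rs_graph(3, 4)))  # n = q + 1
print(girth(rs_graph(3, 2)))  # n = 2
```

For $q = 3$ the first graph has $9$ message nodes and $12$ check nodes and meets the bound $1 + n(q-1) \le q^2$ with equality.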

There are many constructions of graphs without small cycles using finite 
geometries, but these constructions are for the most part not optimal (except for 
cases where the girth is small, e.g., 4, or cases where the message nodes are of 
degree 2). 

The sub-discipline of combinatorics dealing with such questions is called
extremal combinatorics. One of the questions studied here is that of existence of
graphs that do not contain a subgraph of a special type (e.g., a cycle). I will not
go deeper into these problems here and refer the reader to appropriate literature
(e.g., [3]).

A discussion of graphs of large girth is not complete without at least men- 
tioning Ramanujan graphs which have very large girth in an asymptotic sense. I 
will not discuss these graphs at all in this note and refer the reader to [22, 15]. For 
interesting applications of these graphs in coding theory I refer the reader to [33] . 



10. Encoding Algorithms 

An encoding algorithm for a binary linear code of dimension $k$ and block length
$n$ is an algorithm that computes a codeword from $k$ original bits $x_1, \ldots, x_k$. To
compare algorithms against each other, it is important to introduce the concept
of cost, or operations. For the purposes of this note the cost of an algorithm is the
number of arithmetic operations over $\mathbb{F}_2$ that the algorithm uses.

If a basis $g_1, \ldots, g_k$ for the linear code is known, then encoding can be done
by computing $x_1 g_1 + \cdots + x_k g_k$. If the straightforward algorithm is used to perform
the computation (and it is a priori not clear what other types of algorithms one
may use), then the number of operations sufficient for performing the computation
depends on the Hamming weights of the basis vectors. If the vectors are dense,
then the cost of the encoding is proportional to $nk$. For codes of constant rate,
this is proportional to $n^2$, which may be too slow for some applications.

[Figure: bipartite graph with message nodes $x_1, \ldots, x_{10}$ on the left and check nodes $y_1, \ldots, y_5$ on the right, where]

$$\begin{aligned}
y_1 &= x_1 + x_2 + x_3 + x_4 + x_6 + x_8 + x_{10} \\
y_2 &= x_1 + x_3 + x_4 + x_7 + x_8 + x_9 + x_{10} \\
y_3 &= x_2 + x_4 + x_8 \\
y_4 &= x_1 + x_5 + x_7 + x_8 + x_9 + x_{10} \\
y_5 &= x_3 + x_4 + x_5 + x_7 + x_9
\end{aligned}$$

Figure 2. Construction with fast encoder

Unfortunately LDPC codes are given as the null space of a sparse matrix, 
rather than as the space generated by the rows of that matrix. For a given LDPC 
code it is highly unlikely that there exists a basis consisting of sparse vectors, so 
that the straightforward encoding algorithm uses a number of operations that is 
proportional to n 2 . However, we would like to design algorithms for which the 
encoding cost is proportional to n. 

At this point there are at least two possible ways to go. One is to consider
modifications of LDPC codes which are automatically equipped with fast encoding
algorithms. The other is to try to find faster encoding algorithms for LDPC codes.
I will discuss both these approaches here, and outline some of the pros and cons
of each approach.

One simple way to obtain codes from sparse graphs with fast encoding is 
to modify the construction of LDPC codes in such a way that the check nodes 
have values, and the value of each check node is the addition of the values of its 
adjacent message nodes. (In such a case, it would be more appropriate to talk 
about redundant nodes, rather than check nodes, and of information nodes rather 
than message nodes. But to avoid confusion, I will continue calling the right nodes 
check nodes and the left nodes message nodes.) Figure 2 gives an example. The 
number of additions needed in this construction is upper bounded by the number 
of edges. So, efficient encoding is possible if the graph is sparse. The codewords 
in this code consist of the values of the message nodes, appended by the values of 
the check nodes. 
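
As a concrete illustration, the encoder for the graph of Figure 2 can be sketched as follows. The adjacency lists are my reading of the equations in that figure, and the name `encode` is illustrative; addition is over $\mathbb{F}_2$, i.e., XOR.

```python
# Neighbors of the check nodes y_1..y_5, as 1-based indices of x_1..x_10
# (read off the equations of Figure 2).
CHECKS = [
    [1, 2, 3, 4, 6, 8, 10],
    [1, 3, 4, 7, 8, 9, 10],
    [2, 4, 8],
    [1, 5, 7, 8, 9, 10],
    [3, 4, 5, 7, 9],
]

def encode(x):
    """Append the check-node values to the 10 message bits."""
    assert len(x) == 10
    y = []
    for neighbors in CHECKS:
        value = 0
        for j in neighbors:
            value ^= x[j - 1]      # one addition over F_2 per edge
        y.append(value)
    return list(x) + y

print(encode([1, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
print(encode([1] * 10))
```

The inner loop runs once per edge of the graph (28 edges here), which is the sense in which the cost is bounded by the number of edges.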








This construction leads to a linear time encoder, but it has a major problem
with decoding. I will exemplify the problem for the case of the BEC. First, it
is not clear that the belief propagation decoder on the BEC can decode all the
erasures. This is because the check nodes can also be erased (in contrast to the
case of LDPC codes where check nodes do not have a value per se, but only keep
track of the dependencies among the values of the message nodes). This problem
is not an artifact of the non-optimal belief propagation decoder. Even the error
probability of the maximum likelihood decoder is lower bounded by a constant in
this case. Let me elaborate. Suppose that a codeword is transmitted over a BEC
with erasure probability $p$. Then an expected $p$-fraction of the message nodes and
an expected $p$-fraction of the check nodes will be erased. Let $\Lambda_d$ be the fraction
of message nodes of degree $d$. Because the graph is random, a message node of
degree $d$ will have all its neighbors in the set of erased check nodes with probability
$p^d$. This probability is conditioned on the event that the degree of the message
node is $d$. So, the probability that a message node has all its neighbors within
the set of erased check nodes is $\sum_d \Lambda_d p^d$, which is a constant independent of the
length of the code. If such a message node is itself erased, then no algorithm can
recover its value.

In [36] and [18] the following idea is used to overcome this difficulty:
the redundant nodes will themselves be protected with another graph layer to
obtain a second set of redundant nodes; the second set will be protected by a third
set, etc. This way a cascade of graphs is obtained rather than a single graph. At
each stage the number of message and check nodes of the graphs decreases by a
constant fraction. After a logarithmic number of layers the number of check nodes
is small enough so the check nodes can be protected using a sophisticated binary
code for which we are allowed to use a high-complexity decoder. Details can be
found in [17]. If each single graph in the cascade is such that belief propagation
can decode a $p$-fraction of errors, then the entire code will have the same property,
with high probability (provided the final code in the cascade has that property, but
this can be adjusted). All in all, this construction provides linear time encodable
and decodable codes.

The idea of using a cascade, though appealing in theory, is rather cumbersome 
in practice. For example, in the case of the BEC, the variance of the fraction of 
erasures per graph-layer will often be too large to allow for decoding. Moreover, 
maintaining all the graphs is rather complicated and may lead to deficiencies in 
the decoder. (For some ideas on how to decrease these deficiencies, see [17].) 

Another class of codes obtained from sparse graphs and equipped with fast
encoders are the Repeat-Accumulate (RA) codes of Divsalar et al. [11]. The
construction of these codes is somewhat similar to the construction discussed above.
However, instead of protecting the check nodes with another layer of a sparse
graph, the protection is done via a dense graph, and the check nodes of the first
graph are never transmitted. Dense graphs are in general not amenable to fast
encoding. However, the dense graph chosen in an RA code is of a special structure
which makes its computation easy.








Figure 3. An irregular RA code. The left nodes are the information
symbols, and the rightmost nodes are the redundant nodes.
The squares in between are check nodes. Their values are computed
as the addition of the values of their neighbors among the
information nodes. The values of the redundant nodes are calculated
so as to satisfy the relation that the value of each check
node is equal to the addition of the values of the neighboring
redundant nodes.



More formally, the encoding process for RA codes is as follows. Consider an
LDPC code whose graph has $n$ message nodes and $r$ check nodes. The value of the
$r$ check nodes is computed using the procedure introduced above, i.e., the value
of each check node is the addition of the values of its adjacent message nodes.
Let $(y_1, \ldots, y_r)$ denote the values of these check nodes. The redundant values
$(s_1, \ldots, s_r)$ are now calculated as follows: $s_1 = y_1$, $s_2 = s_1 + y_2$, ..., $s_r = s_{r-1} + y_r$.
(This explains the phrase “accumulate.”) An example of an RA code is given in
Figure 3.
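
The two encoding steps translate directly into code. This is a minimal sketch in which the graph, given as a list of neighbor lists with 0-based indices, and the name `ra_encode` are my own illustrative choices; all additions are over $\mathbb{F}_2$.

```python
from itertools import accumulate
from operator import xor

def ra_encode(x, checks):
    """x: list of information bits; checks[i]: indices of the neighbors of check node i.
    Returns the information bits followed by the redundant values s_1..s_r.
    """
    # step 1: each check value y_i is the sum of its adjacent message nodes
    y = []
    for neighbors in checks:
        value = 0
        for j in neighbors:
            value ^= x[j]
        y.append(value)
    # step 2: accumulate, s_1 = y_1 and s_i = s_{i-1} + y_i
    s = list(accumulate(y, xor))
    return list(x) + s

# Hypothetical toy graph with 4 information nodes and 4 check nodes.
checks = [[0, 1], [1, 2], [2, 3], [0, 3]]
print(ra_encode([1, 0, 1, 1], checks))
```

Note that only the information bits and the accumulated values appear in the output; the intermediate values $y_i$ are not part of the codeword.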

The original RA codes used a $(1, k)$-biregular graph for some $k$ as the graph
defining the LDPC code. (This explains the phrase “repeat.”) RA codes were
generalized to encompass irregular RA codes for which the underlying graph can
be any bipartite graph [14]. The same paper introduces degree distributions for
which the corresponding RA codes achieve capacity of the BEC.

We conclude this section by mentioning the work of Richardson and Urbanke [32]
which provides an algorithm for encoding LDPC codes. They show that
if the degree distribution $(\lambda(x), \rho(x))$ is such that $\rho(1 - \lambda(x)) < 1 - x$ for $x \in (0, 1)$,
and such that $\lambda_2 \rho'(1) > 1$, then the LDPC code can be encoded in linear time.
The condition $\lambda_2 \rho'(1) > 1$ has the following interpretation: consider the graph
generated by the message nodes of degree 2. This graph induces a graph on the
check nodes, by interpreting the message nodes of degree 2 as edges in that graph
(see Section 12). Then $\lambda_2 \rho'(1) > 1$ implies that this induced graph has a connected
component whose number of vertices is a constant fraction of the number of check
nodes. This will be explained further in Section 12, where this condition is actually
used to devise a linear time encoding algorithm for a certain type of graphs. I will
not discuss the result of Richardson and Urbanke further, and will refer the reader
to [32].



11. Finite-Length Analysis 

Density evolution gives a somewhat satisfactory answer to the asymptotic
performance of random LDPC codes with a given degree distribution. It is possible
to refine the analysis of density evolution to obtain upper bounds on the error
probability of the decoder in terms of the degree distributions, and in terms of
the number of message and check nodes. However, these bounds are very poor
even when the number of message nodes is in the tens of thousands. This is
primarily due to two reasons. First, density evolution is only valid as long as the
neighborhood around message nodes is a tree. For small graphs this corresponds to a
very small number of iterations, which is usually too small to reduce the fraction
of errors in the graph to a reasonable amount. Second, the set of tools used is itself
a source of inaccuracy, since the bounds obtained from the probabilistic analysis
are too weak for small lengths.

For these reasons it is important to develop other methods for analyzing the
performance of message passing algorithms on small graphs. So far this has only
started for the case of the BEC [10]. In this case the analysis is of a combinatorial
flavor. Given a bipartite graph, its associated code, and a set of erasures among the
message nodes, consider the graph induced by the erased message nodes. A stopping
set in this graph is a set of message nodes such that the graph induced by these
message nodes has the property that no check node has degree one. The number
of message nodes in the stopping set is called its size. It should be clear that
belief propagation for the BEC stops prematurely (i.e., without recovering all the
message nodes) if and only if this subgraph has a nonempty stopping set. Figure 4
gives some examples of graphs that are themselves stopping sets. Since unions of
stopping sets are stopping sets, any finite graph contains a unique maximal stopping
set (which may be the empty set). For a random bipartite graph the probability that belief
propagation on the BEC has not recovered $\ell$ message nodes at the point of failure
($\ell$ can be zero) is the probability that the graph induced by the erased message
nodes has a maximal stopping set of size $\ell$.
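
The maximal stopping set inside a given set of erasures can be found by repeatedly discarding message nodes that have a check node of degree one in the induced graph. The following sketch uses names and a toy graph of my own choosing.

```python
def maximal_stopping_set(adj, erased):
    """adj[i]: set of check nodes adjacent to message node i.
    erased: set of erased message nodes.
    Returns the maximal stopping set contained in `erased`.
    """
    remaining = set(erased)
    while True:
        # degree of every check node in the graph induced by `remaining`
        count = {}
        for i in remaining:
            for c in adj[i]:
                count[c] = count.get(c, 0) + 1
        # a message node with a degree-one check neighbor cannot belong
        # to any stopping set inside `remaining`
        removable = {i for i in remaining
                     if any(count[c] == 1 for c in adj[i])}
        if not removable:
            return remaining
        remaining -= removable

# Hypothetical toy graph: 4 message nodes of degree 2 on 4 check nodes.
adj = [{0, 1}, {1, 2}, {0, 2}, {2, 3}]
print(maximal_stopping_set(adj, {0, 1, 2, 3}))  # nodes 0, 1, 2 remain
print(maximal_stopping_set(adj, {0, 1, 3}))     # empty: all recovered
```

The set returned is exactly the set of message nodes that the erasure decoder fails to recover, so its size is the quantity $\ell$ discussed above.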

Besides [10] several papers discuss finite-length analysis of LDPC codes on
the BEC [25, 26, 30].










Figure 4. Examples of stopping sets 



I am not aware of similar analysis tools for channels other than the BEC. 
Generalizing the concept of stopping sets to other channels would certainly be a 
worthwhile effort. A recent paper by Feldman et al. [12] gives a different analysis 
tool using a linear programming relaxation of the decoding problem. 



12. An Example 

In this section I will exemplify most of the above concepts for a special type 
of LDPC codes. The codes I will describe in this section certainly do not stand 
out because of their performance. However, it is rather easy to derive the main 
concepts for them and this warrants their discussion in the framework of this 
note. Moreover, it seems that a thorough understanding of their behavior is very 
important for understanding belief propagation for general LDPC codes. I will try 
to clarify this more at the end of the section. 

For given $n$ and $r$ let $P(n, r)$ denote the ensemble of bipartite graphs with $n$
message nodes and $r$ check nodes for which each message node has degree 2 and
its two neighbors among the check nodes are chosen independently at random.
The check node degrees in such a graph are binomially distributed, and if $n$ and $r$
are large, then the distribution is very close to a Poisson distribution with mean
$2n/r$. (This is a well-known fact, but the reader may try to prove it for herself.)
It turns out that the edge degree distribution of the graph is very close to $e^{a(x-1)}$,
where $a = 2n/r$ is the average degree of the check nodes.

First, let us see how many erasures this code can correct. The maximum
fraction of erasures is $1 - R$, where $R$ is the rate, which is at least $1 - r/n$. We
should therefore not expect to be able to correct more than an $r/n$-fraction of
erasures, i.e., more than a $2/a$-fraction. We now apply Condition (5.1): $p$ is the
maximum fraction of correctable erasures iff $p \cdot (1 - e^{-ax}) < x$ for $x \in (0, p)$.
Replacing $x$ by $px$, this condition becomes

$$1 - e^{-\beta x} < x \quad \text{for } x \in (0, 1), \qquad \beta = pa. \tag{12.1}$$






Figure 5. A graph with left degree 2 and its induced graph on 
the check nodes 



This latter condition has an interesting interpretation: the graph induced by the
$p$-fraction of erasures is a random graph in the ensemble $P(e, r)$, where $e$ is the
number of erasures, the expected value of which is $pn$. For this graph the edge
distribution from the point of view of the check nodes is $e^{pa(x-1)}$, and thus (5.1)
implies (12.1).

Next, I will show that the maximum value of $\beta$ for which Condition (12.1)
holds is $\beta = 1$. For the function $1 - e^{-\beta x} - x$ to be less than 0 in $(0, 1)$, it is necessary
that the derivative of this function be non-positive at 0. The derivative is $\beta e^{-\beta x} - 1$,
and its value at 0 is $\beta - 1$. Hence, $\beta \le 1$ is a necessary condition for (12.1) to
hold. On the other hand, if $\beta = 1$, then (12.1) is satisfied. Therefore, the maximum
fraction of correctable erasures for a code in the ensemble $P(n, r)$ is $r/(2n)$, i.e.,
the performance of these codes is at half the capacity. So, the ensemble $P(n, r)$ is
not a very good ensemble in terms of the performance of belief propagation on the
BEC.
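
A quick numerical check of this threshold, evaluating Condition (12.1) on a finite grid only, could look as follows. The function names are illustrative, and the grid test is a sanity check rather than a proof.

```python
import math

def condition_12_1_holds(beta, samples=1000):
    """Grid check of 1 - exp(-beta*x) < x for x in (0, 1)."""
    return all(1 - math.exp(-beta * k / samples) < k / samples
               for k in range(1, samples))

def bp_threshold(n, r):
    """Erasure fraction p with beta = p*a = 1, where a = 2n/r."""
    return r / (2 * n)

print(condition_12_1_holds(1.0))  # boundary case beta = 1
print(condition_12_1_holds(1.1))  # violated for small x
print(bp_threshold(1000, 500))    # compare with r/n = 0.5
```

For $n = 1000$ and $r = 500$ the threshold is $0.25$, half of the fraction $r/n = 0.5$ that one could hope for.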

Before I go further in the discussion of codes in the ensemble P(n, r), let me 
give a different view of these codes. A bipartite graph with n message nodes and 
r check nodes in which each message node has degree 2 defines a (multi-)graph on
the set of check nodes by regarding each message node as an edge in the graph in 
the obvious way. Multi-graphs and bipartite graphs with message degree 2 are in 
one-to-one correspondence to each other. In the following we will call the graph 
formed on the check nodes of a bipartite graph G with message degree 2 induced 
by G. Figure 5 gives an example. 

For a graph in the ensemble $P(n, r)$ the corresponding induced graph is a
random graph of type $G_{r,n}$, where $G_{m,E}$ denotes the random graph on $m$ vertices
in which $E$ edges are chosen randomly and with replacement among all the possible
$\binom{m}{2}$ edges in the graph.







For a bipartite graph $G$ with message degree 2 the stopping sets are precisely
the edges of a 2-core. Let me define this notion: For any graph and any integer $k$
the $k$-core of the graph is the unique maximal subgraph of that graph in which each
node has degree at least $k$. The $k$-core may of course be empty.
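
In the induced (multi-)graph the $k$-core can be computed by repeatedly deleting edges that touch a node of degree less than $k$. The sketch below returns the surviving edges, since these are what correspond to message nodes; the function name and the example are my own.

```python
from collections import Counter

def k_core_edges(edges, k=2):
    """edges: list of (u, v) pairs of a multigraph on the check nodes.
    Returns the indices of the edges that belong to the k-core.
    """
    alive = set(range(len(edges)))
    while True:
        degree = Counter()
        for e in alive:
            u, v = edges[e]
            degree[u] += 1
            degree[v] += 1   # a loop (u == v) adds 2 to the degree of u
        doomed = {e for e in alive
                  if degree[edges[e][0]] < k or degree[edges[e][1]] < k}
        if not doomed:
            return alive
        alive -= doomed

# A triangle on the check nodes 0, 1, 2 with a pendant edge to node 3.
print(k_core_edges([(0, 1), (1, 2), (0, 2), (2, 3)]))  # the triangle survives
print(k_core_edges([(0, 1), (1, 2)]))                  # a path has empty 2-core
```

The first example is the graph view of a bipartite graph with four message nodes of degree 2: the three surviving edges are the message nodes that form the maximal stopping set.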

It is a well-known fact [4] that a giant 2-core exists with high probability in
a random graph with $E$ edges on $m$ nodes iff the average degree of a node is larger
than 1, i.e., iff $2E > m$. (A giant 2-core in the graph is a 2-core whose size is linear
in the number of vertices of the graph.) Condition (12.1) is a new proof for this
fact, as it shows that if the average degree of the induced graph is smaller than 1,
then with high probability the graph does not contain a 2-core of linear size. It is
also a well-known fact that this is precisely the condition for the random graph to
contain a giant component, i.e., a component with linearly many nodes. Therefore,
condition (12.1) can also be viewed as a condition on the graph not having a large
component. (This condition is even more precise, as it gives the expected fraction
of unrecovered message nodes at the time of failure of the decoder: it is $p$ times
the unique root of the equation $1 - x - e^{-\beta x} = 0$ in the interval $(0, 1)$; incidentally, this
is exactly the expected size of the giant component in the graph, as is well known
in random graph theory [4].)
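
The root mentioned in the parenthetical remark has no closed form, but it is easy to approximate by bisection. This is a sketch with an illustrative function name; it assumes that $\beta$ is not extremely close to 1, where the root approaches 0 and the starting bracket would need more care.

```python
import math

def unrecovered_root(beta, tol=1e-12):
    """Root of 1 - x - exp(-beta*x) = 0 in (0, 1); returns 0.0 if beta <= 1."""
    if beta <= 1:
        return 0.0   # condition (12.1) holds, there is no root in (0, 1)

    def f(x):
        return 1 - x - math.exp(-beta * x)

    lo, hi = 1e-9, 1.0   # f(lo) > 0 and f(hi) < 0 when beta > 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(unrecovered_root(2.0))
print(unrecovered_root(0.5))
```

For $\beta = 2$ the root is about $0.797$, so roughly a $0.797\,p$-fraction of the message nodes is expected to remain unrecovered.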

More generally, one can study graphs from the ensemble $C(n, r, \rho(x))$ denoting
random graphs with $n$ message and $r$ check nodes with edge degree distribution
on the check side given by $\rho(x) = \sum_d \rho_d x^{d-1}$ (i.e., the probability that an edge is
connected to a check node of degree $d$ is $\rho_d$). The maximum fraction of tolerable
erasures in this case is the supremum of all $p$ such that $1 - \rho(1 - px) - x < 0$
for $x \in (0, 1)$. This yields the stability condition $p\rho'(1) \le 1$. This condition is also
sufficient, since it implies that $p\rho'(1 - px) < 1$ on $(0, 1)$, hence $1 - \rho(1 - px) - x$ is
monotonically decreasing, and since this function is 0 at $x = 0$, it is negative for
$x \in (0, 1)$.

The condition $p\rho'(1) < 1$ is equivalent to the statement that the graph
induced on the check nodes by the bipartite graph does not have a giant component.
This follows from results in [23]. According to that paper, if a graph is chosen randomly
on $n$ nodes subject to the condition that for each $d$ the fraction of nodes of degree
$d$ is essentially $R_d$ (see the paper for a precise definition), then the graph has
almost surely a giant component iff $\sum_d d(d-2)R_d > 0$. Consider the graph obtained
from the restriction of the message nodes to a $p$-fraction, and consider the graph
induced by this smaller graph on the check nodes. Then, it is not hard to see that
the degree distribution for this graph is $R(px + 1 - p)$, where $R(x) = c \int_0^x \rho(t)\,dt$
and $c$ is the average degree of the check nodes in the original bipartite graph.
Therefore, the condition in [23] for the induced graph not to have a giant component
reads $pR''(1) < c$, where $R''(x)$ is the second derivative of $R(x)$. This is precisely
equal to $p\rho'(1) < 1$, i.e., the stability condition is equivalent to the statement that
the induced graph does not have a giant component. Incidentally, this is also equivalent
to the condition that the graph does not have a giant 2-core, since stopping sets are
equivalent to 2-cores in this setting. The fraction of nodes in the giant 2-core (if
it exists) is equal to the unique solution of the equation $1 - \rho(1 - px) - x = 0$ in
$(0, 1)$. (Compare this also to [24], which obtains formulas for the size of the giant
component in a random irregular graph.)

LDPC codes from graphs with left degree 2 play an important role. For example,
consider the stability condition proved in [29]. It states that small amounts
of noise are correctable by belief propagation for an LDPC code with degree
distribution given by $\lambda(x)$ and $\rho(x)$ if and only if
$\lambda_2 \rho'(1) < \left( \int f(x) e^{-x/2}\,dx \right)^{-1}$,
where $f(x)$ is the density of the log-likelihood of the channel. For example, for the
BEC with erasure probability $p$ we obtain $\lambda_2 \rho'(1) < 1/p$, and for the BSC with
error probability $p$ we obtain $\lambda_2 \rho'(1) < 1/(2\sqrt{p(1 - p)})$. The stability condition is
actually the condition that belief propagation is successful on the subgraph induced
by message nodes of degree 2 (see [9]). This is not surprising, since these message
nodes are those that are corrected last in the algorithm. (I do not give a proof
of this, but this should sound reasonable, since message nodes of degree 2 receive
very few messages in each round of iteration, and hence get corrected only when
all the incoming messages are reasonably correct.)
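
For the two channels mentioned, the right-hand side of the stability condition can be evaluated directly from the integral. The sketch below assumes the usual convention that $f$ is the density of the log-likelihood ratio when the all-zero codeword is sent, so that it is a point mass at 0 (weight $p$) and at $+\infty$ for the BEC, and two point masses at $\pm\ln((1-p)/p)$ for the BSC. The function names and the example values of $\lambda_2$ and $\rho'(1)$ are hypothetical.

```python
import math

def bec_bound(p):
    # only the mass p at x = 0 contributes to the integral of f(x)*exp(-x/2)
    return 1 / p

def bsc_bound(p):
    # masses 1-p at +L and p at -L, with L = ln((1-p)/p)
    L = math.log((1 - p) / p)
    integral = (1 - p) * math.exp(-L / 2) + p * math.exp(L / 2)
    return 1 / integral      # equals 1 / (2*sqrt(p*(1-p)))

def is_stable(lambda2, rho_prime_at_1, bound):
    return lambda2 * rho_prime_at_1 < bound

# Hypothetical degree distribution with lambda_2 = 0.3 and rho(x) = x^5.
lambda2, rho_prime = 0.3, 5.0
print(is_stable(lambda2, rho_prime, bec_bound(0.5)))
print(is_stable(lambda2, rho_prime, bec_bound(0.7)))
print(is_stable(lambda2, rho_prime, bsc_bound(0.1)))
```

In this example $\lambda_2\rho'(1) = 1.5$, which is below $1/p = 2$ for the BEC with $p = 0.5$ but above $1/p \approx 1.43$ for $p = 0.7$.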

Acknowledgment 

Many thanks to Frank Kurth, Mehdi Molkaraie, and Mahmoud Rashidpour for
pointing out typos and proof-reading an earlier version of this paper. I would also
like to thank the Institute for Studies in Theoretical Physics and Mathematics
(IPM), and especially Dr. Reza Khosrovshahi, for their warm hospitality during
my visit in November of 2002.

References 

[1] N. Alon, S. Hoory, and N. Linial. The Moore bound for irregular graphs. To appear,
2002.

[2] L. Bazzi, T. Richardson, and R. Urbanke. Exact thresholds and optimal codes for
the binary symmetric channel and Gallager’s decoding algorithm A. IEEE Trans.
Inform. Theory, 47, 2001.

[3] B. Bollobas. Extremal Graph Theory. Academic Press, 1978.

[4] B. Bollobas. Random Graphs. Academic Press, 1985.

[5] D. Burshtein and G. Miller. Expander graph arguments for message-passing
algorithms. IEEE Trans. Inform. Theory, 47, 2001.

[6] J. Byers, M. Luby, M. Mitzenmacher, and A. Rege. A digital fountain approach to
reliable distribution of bulk data. In Proceedings of ACM SIGCOMM 98, 1998.

[7] J.-F. Cheng, D. MacKay, and R. McEliece. Turbo decoding as an instance of Pearl’s
belief propagation algorithm. IEEE J. Sel. Areas Comm., 16:140-152, 1998.

[8] S-Y. Chung, D. Forney, T. Richardson, and R. Urbanke. On the design of low-density
parity-check codes within 0.0045 dB of the Shannon limit. IEEE Communication
Letters, 5:58-60, 2001.

[9] L. Decreusefond and G. Zemor. On the error-correcting capabilities of cycle codes
of graphs. Combinatorics, Probability, and Computing, 6:27-38, 1997.







[10] C. Di, D. Proietti, E. Telatar, T. Richardson, and R. Urbanke. Finite-length analysis
of low-density parity-check codes on the binary erasure channel. IEEE Trans.
Inform. Theory, 48:1570-1579, 2002.

[11] D. Divsalar, H. Jin, and R. McEliece. Coding theorems for ’Turbo-like’ codes. In
Proceedings of the 1998 Allerton Conference, pages 201-210, 1998.

[12] J. Feldman, D. Karger, and M. Wainwright. Using linear programming to decode
linear codes. In Proceedings of the 37th Annual Conference on Information Sciences
and Systems (CISS’03), 2003.

[13] R. G. Gallager. Low Density Parity-Check Codes. MIT Press, Cambridge, MA, 1963.

[14] H. Jin, A. Khandekar, and R. McEliece. Irregular repeat-accumulate codes. In Proc.
2nd International Symposium on Turbo Codes, pages 1-8, 2000.

[15] A. Lubotzky, R. Phillips, and P. Sarnak. Ramanujan graphs. Combinatorica,
8(3):261-277, 1988.

[16] M. Luby, M. Mitzenmacher, and A. Shokrollahi. Analysis of random processes via
and-or tree evaluation. In Proceedings of the 9th Annual ACM-SIAM Symposium on
Discrete Algorithms, pages 364-373, 1998.

[17] M. Luby, M. Mitzenmacher, A. Shokrollahi, and D. Spielman. Efficient erasure
correcting codes. IEEE Trans. Inform. Theory, 47:569-584, 2001.

[18] M. Luby, M. Mitzenmacher, A. Shokrollahi, and D. Spielman. Improved low-density
parity-check codes using irregular graphs. IEEE Trans. Inform. Theory, 47:585-598,
2001.

[19] M. Luby, M. Mitzenmacher, A. Shokrollahi, D. Spielman, and V. Stemann. Practical
loss-resilient codes. In Proceedings of the 29th Annual ACM Symposium on Theory
of Computing, pages 150-159, 1997.

[20] D.J.C. MacKay. Good error-correcting codes based on very sparse matrices. IEEE
Trans. Inform. Theory, 45:399-431, 1999.

[21] D.J.C. MacKay and R.M. Neal. Good codes based on very sparse matrices. In
Cryptography and Coding, 5th IMA Conference, number 1025 in Lecture Notes in
Computer Science, pages 100-111, 1995.

[22] G. A. Margulis. Explicit group-theoretic constructions of combinatorial schemes and
their applications in the construction of expanders and concentrators. Problems of
Information Transmission, 24(1):39-46, 1988.

[23] M. Molloy and B. Reed. A critical point for random graphs with a given degree
sequence. Random Structures and Algorithms, 6:161-179, 1995. Can be downloaded
from http://citeseer.nj.nec.com/molloy95critical.html.

[24] M. Molloy and B. Reed. The size of the giant component of a random graph with
a given degree sequence. Combin. Probab. Comput., 7:295-305, 1998. Can be
downloaded from http://citeseer.nj.nec.com/molloy98size.html.

[25] A. Orlitsky, R. Urbanke, K. Viswanathan, and J. Zhang. Stopping sets and the girth
of Tanner graphs. In Proceedings of the International Symposium on Information
Theory, 2002.

[26] A. Orlitsky and J. Zhang. Finite-length analysis of LDPC codes with large left
degrees. In Proceedings of the International Symposium on Information Theory, 2002.







[27] P. Oswald and A. Shokrollahi. Capacity-achieving sequences for the erasure channel.
IEEE Trans. Inform. Theory, 48:3017-3028, 2002.

[28] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible
Inference. Morgan Kaufmann Publishers, Inc., 1988.

[29] T. Richardson, A. Shokrollahi, and R. Urbanke. Design of capacity-approaching
irregular low-density parity-check codes. IEEE Trans. Inform. Theory, 47:619-637,
2001.

[30] T. Richardson, A. Shokrollahi, and R. Urbanke. Finite-length analysis of various
low-density parity-check ensembles for the binary erasure channel. In Proceedings of
the International Symposium on Information Theory, 2002.

[31] T. Richardson and R. Urbanke. The capacity of low-density parity-check codes under
message-passing decoding. IEEE Trans. Inform. Theory, 47:599-618, 2001.

[32] T. Richardson and R. Urbanke. Efficient encoding of low-density parity-check codes.
IEEE Trans. Inform. Theory, 47:638-656, 2001.

[33] J. Rosenthal and P. Vontobel. Construction of LDPC codes using Ramanujan graphs
and ideas from Margulis. In Proceedings of the 38th Allerton Conference on
Communication, Control, and Computing, pages 248-257, 2000.

[34] A. Shokrollahi. New sequences of linear time erasure codes approaching the channel
capacity. In M. Fossorier, H. Imai, S. Lin, and A. Poli, editors, Proceedings of the
13th International Symposium on Applied Algebra, Algebraic Algorithms, and
Error-Correcting Codes, number 1719 in Lecture Notes in Computer Science, pages 65-76,
1999.

[35] M. Sipser and D. Spielman. Expander codes. IEEE Trans. Inform. Theory,
42:1710-1722, 1996.

[36] D. Spielman. Linear-time encodable and decodable error-correcting codes. IEEE
Trans. Inform. Theory, 42:1723-1731, 1996.

[37] R.M. Tanner. A recursive approach to low complexity codes. IEEE Trans.
Inform. Theory, 27:533-547, 1981.

[38] V.V. Zyablov and M.S. Pinsker. Estimation of error-correction complexity of
Gallager low-density codes. Probl. Inform. Transm., 11:18-28, 1976.



Amin Shokrollahi 
Laboratoire d’algorithmique 
EPFL 

1015 Lausanne 
Switzerland 

and 

Digital Fountain, Inc. 

39141 Civic Center Drive 
Fremont, CA 94538 
USA 

e-mail: amin.shokrollahi@epfl.ch 
e-mail: amin@digitalfountain.com 




Contributed Papers 




Progress in Computer Science and Applied Logic, Vol. 23, 113-127 
© 2004 Birkhauser Verlag Basel/Switzerland 



The New Implementation Schemes of 
the TTM Cryptosystem Are Not Secure 

Jintai Ding and Dieter Schmidt 



Abstract. We show that the new TTM implementation schemes have a defect.
There exist linearization equations

$$\sum_{i=1,j=1}^{n,m} a_{ij} x_i y_j(x_1,\dots,x_n) + \sum_{i=1}^{n} b_i x_i + \sum_{j=1}^{m} c_j y_j(x_1,\dots,x_n) + d = 0,$$

which are satisfied by the components $y_j(x_1,\dots,x_n)$ of the ciphers of the
TTM schemes. The inventor of TTM used two versions of the paper [2] to
refute a claim in [3]. When we do a linear substitution with the linear
equations derived from the linearization equations for a given ciphertext, we can
find the plaintext by iterating a procedure that first searches for linear
equations by linear combinations and then performs a linear substitution. The
computational complexity of the attack on these two schemes is less than $2^{35}$ over a
finite field of size $2^8$.

Keywords. Open-key, multivariable, quadratic polynomials, linearization. 



1. Introduction 

Recently new methods were invented to construct multivariable cryptosystems,
namely cryptosystems based on multivariable functions instead of single variable
functions. The security of such systems relies on the difficulty of solving
polynomial equations in many variables, a problem that is NP-hard in general.

Matsumoto and Imai suggested one of the first constructions of such
cryptosystems [6], which unfortunately has been defeated [8]. Another interesting one is
the TTM cryptosystem [7], which was patented in the US in 1998 and is currently 
marketed by US Data Security Inc. (www.usdsi.com). This system is based on the 
idea of the composition of invertible polynomial maps, which is closely related to 
the famous Jacobian Conjecture. Despite the claim of the inventors that the TTM 
systems are very secure from all standard attacks, the authors of [3] claimed that 
they completely defeated all possible TTM schemes using the Minrank method 
and demonstrated it by defeating one of the challenges set by the inventors of 







TTM. However, the inventors of TTM refuted the claim with [2], where they gave
a new implementation scheme to support their claim. In [5], another method was
found to defeat the first TTM implementation scheme in [7]. Though this new
method can also be applied to other TTM implementation schemes [1], it cannot
be directly applied to all existing implementation schemes, such as the new ones
in the two versions of [2]. In this article, we will show that actually all existing
implementation schemes for the TTM cryptosystem have a common defect that could
make them insecure. For the case of the most recent two TTM implementation
schemes in the two different versions of the paper [2], we use this defect to defeat the
schemes.

The key idea comes from an observation that we can also extend the lin- 
earization method by Patarin [8] to attack all TTM implementation schemes. 

In all TTM implementation schemes, a cipher $F$ is made of $m$ degree two
polynomials of $n$ variables over a finite field $K$ of characteristic 2, namely,

$$F(x_1,\dots,x_n) = (y_1(x_1,\dots,x_n),\dots,y_m(x_1,\dots,x_n)), \tag{1.1}$$

where $m > n$. These $m$ polynomials $y_i$ are made public. The cipher $F$ is given
as a map from $K^n$ to $K^m$ and it is derived from $F = \phi_4 \circ \phi_3 \circ \phi_2 \circ \bar\phi_1$, where
$\circ$ denotes a composition of maps, $\phi_4$ and $\bar\phi_1$ are affine linear maps, $\phi_4$ is an
invertible map from $K^m$ to $K^m$, $\bar\phi_1$ is an injective map from $K^n$ to $K^m$ and $\phi_3$
and $\phi_2$ are nonlinear maps of the de Jonquieres type on $K^m$. Given an element
$X = (z_1,\dots,z_m)$ in $K^m$, a de Jonquieres map $J(X)$ is defined as a map from $K^m$
to $K^m$:

$$J(X) = (z_1 + g_1(z_2,\dots,z_m),\ z_2 + g_2(z_3,\dots,z_m),\ \dots,\ z_{m-1} + g_{m-1}(z_m),\ z_m),$$

where the $g_i$ are polynomial functions.

An affine multiple method uses an equation of the form

$$\sum_{i=1,j=1}^{n,m} a_{ij} x_i y_j(x_1,\dots,x_n) + \sum_{i=1}^{n} b_i x_i + \sum_{j=1}^{m} c_j y_j(x_1,\dots,x_n) + d = 0, \tag{1.2}$$

which is satisfied by the set of polynomials $y_i$ of the cipher $F$ and its variables $x_i$.
This equation, which we call a ‘linearization equation’, was first used by Patarin to
successfully attack the Matsumoto-Imai cryptosystems.
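
The search for such equations is plain linear algebra: every sample point $x$ gives one linear condition on the unknown coefficients $(a_{ij}, b_i, c_j, d)$, and the linearization equations form the null space of the resulting matrix. The following is a minimal sketch under strong simplifying assumptions: it uses the toy prime field GF(7) instead of a field of characteristic 2, and a hypothetical two-component cipher; none of the names below come from the paper.

```python
from itertools import product

P = 7        # toy prime field GF(7); an assumption, not the field of the paper
N, M = 3, 2  # toy sizes: 3 plaintext variables, 2 public polynomials

def cipher(x):
    """Hypothetical toy cipher: y_0 = x_0*x_1, y_1 = x_1*x_2."""
    return [(x[0] * x[1]) % P, (x[1] * x[2]) % P]

def row(x):
    """Values of the monomials x_i*y_j, x_i, y_j, 1 at the point x."""
    y = cipher(x)
    return [(xi * yj) % P for xi in x for yj in y] + list(x) + y + [1]

def nullspace_mod_p(rows, p):
    """Basis of {v : rows * v = 0 mod p}, computed by Gaussian elimination."""
    rows = [r[:] for r in rows]
    ncols = len(rows[0])
    pivots, r = [], 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] % p), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][c], -1, p)
        rows[r] = [(v * inv) % p for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] % p:
                f = rows[i][c]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for fc in (c for c in range(ncols) if c not in pivots):
        v = [0] * ncols
        v[fc] = 1
        for prow, pc in zip(rows, pivots):
            v[pc] = (-prow[fc]) % p
        basis.append(v)
    return basis

# One linear condition per point of GF(7)^3.
samples = [row(x) for x in product(range(P), repeat=N)]
equations = nullspace_mod_p(samples, P)
```

In this toy case the null space is one-dimensional and corresponds to $x_2 y_0 - x_0 y_1 = 0$. For a cipher of realistic size one would sample random points rather than enumerate the whole space.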

From the construction of the TTM implementation schemes, we found that
all existing TTM implementation schemes have a large number of linearization
equations, which are satisfied by the quadratic polynomials $y_i$ of the TTM
cipher $F$. For example, for the most recently proposed implementation scheme [2]
(the revised version on the IACR e-Archive; the former version has a different
implementation scheme), where $m = n + 52$, we found all linearization equations and
computed that the dimension of $V$ is actually 347, where $V$ is the linear space of
all the linearization equations satisfied by the quadratic polynomials $y_i$.

This is the source of the common defect among all TTM implementation
schemes. The existence of the linearization equations means that for a given
ciphertext $(y'_1,\dots,y'_m)$, we can immediately produce some linear equations satisfied
by the plaintext $(x'_1,\dots,x'_n)$, which is something that a secure open key
cryptosystem should not allow. For the case of the revised implementation scheme [2], we
found that, with the probability $1 - \binom{17}{5}/2^{12\times 8} > 1 - 2^{-82}$, the linearization equations
will produce 17 linearly independent linear equations satisfied by the $x_i$.
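
To illustrate how a ciphertext turns a linearization equation into a linear equation in the plaintext, the following sketch substitutes known values $y'_j$ into an equation of the form (1.2). It is a toy under stated assumptions: the field is GF(7) rather than a field of size $2^8$, and the single equation $x_2 y_0 - x_0 y_1 = 0$ belongs to a hypothetical cipher with $y_0 = x_0 x_1$, $y_1 = x_1 x_2$; the function name is illustrative.

```python
P = 7  # toy prime field; an assumption, the paper works over a field of size 2^8

def linear_equation_from_ciphertext(a, b, c, d, y, p=P):
    """Substitute ciphertext values y_j into
         sum a[i][j]*x_i*y_j + sum b[i]*x_i + sum c[j]*y_j + d = 0
    and return (coeffs, const) such that sum coeffs[i]*x_i + const = 0 mod p."""
    n, m = len(b), len(c)
    coeffs = [(sum(a[i][j] * y[j] for j in range(m)) + b[i]) % p
              for i in range(n)]
    const = (sum(c[j] * y[j] for j in range(m)) + d) % p
    return coeffs, const

# Toy linearization equation x_2*y_0 - x_0*y_1 = 0 (indices are 0-based).
a = [[0, P - 1], [0, 0], [1, 0]]
b, c, d = [0, 0, 0], [0, 0], 0

plaintext = [2, 3, 5]
ciphertext = [(plaintext[0] * plaintext[1]) % P,
              (plaintext[1] * plaintext[2]) % P]
coeffs, const = linear_equation_from_ciphertext(a, b, c, d, ciphertext)
```

The attacker only needs `ciphertext` and the public coefficients; the resulting linear equation is satisfied by the unknown plaintext.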

For this case we can move one step further by substituting
these 17 linear equations into the $y_i$, which makes the $y_i$ quadratic polynomials with 17
fewer variables, which we denote by $(x_{v_1},\dots,x_{v_{31}})$. Now $F$ becomes a new map $\tilde F$
from $K^{n-17}$ to $K^m$, which in the composition form can be equivalently rewritten
as $\tilde F = \tilde\phi_4 \circ \phi_2 \circ \tilde\phi_1$, where $\tilde\phi_4$, which is invertible, and $\tilde\phi_1$, which is injective, are
affine linear maps. The substitution of the 17 linear equations
eliminates one of the composition factors of the de Jonquieres type. Then
solving the equations $\tilde F = (y'_1,\dots,y'_m)$ for the given ciphertext becomes
straightforward because of the triangular form of the de Jonquieres type of maps, and it is
accomplished by iterating a procedure that first searches for linear equations by
linear combinations and then performs a linear substitution. Finally the plaintext can be
derived by substituting the values found for $(x_{v_1},\dots,x_{v_{31}})$ into the original 17
linear equations. For the practical example $m = 100$ proposed in [2], we can show
that it takes about $2^{32}$ computations on a finite field of size $2^8$ to defeat the scheme.
We performed a computation example on a PC (450 MHz) and defeated it in a
few hours. Similarly, we can defeat the TTM scheme in the original version of [2].

We arrange the paper in the following way. In Section 2, we will first discuss
the basic idea of TTM. Then, we will present the details of our attacks on two
different implementation schemes of the TTM: the first one is the one in the revised
version (July 2002) of [2], the second one is the one suggested in the first version
(August 2001) of [2]. In Section 3, we will present the conclusion.

2. The Common Defect of the TTM Schemes 

2.1. Basic Technical Idea of the TTM Schemes 

Let $F(x_1,\dots,x_m)$ be a map on the space $K^m$. It is a composition of several maps
$G_i$ on $K^m$, $i = 1,\dots,k$, $F = G_1 \circ G_2 \circ \cdots \circ G_k$, and has the following properties:

(I) $F(x_1,\dots,x_m)$ is easy and fast to compute if we are given specific values for
all $x_i$.

(II) The factorization of $F$ in terms of the composition of $G_i$ is very difficult to
compute if we only know the expanded version of $F(x_1,\dots,x_m)$, that is, $F^{-1}$
is very difficult to compute without such a decomposition, and the $G_i$ are very
easy to invert.

With such an $F(x_1,\dots,x_m)$ and if the equation $F(x_1,\dots,x_m) = (a_1,\dots,a_m)$ is
impossible to solve directly, we can use $F$ to build an open-key cryptosystem.
The Matsumoto-Imai construction [6] is an attempt at such a type of construction.

For the TTM construction, one uses only the following two types of maps. 

1) The Linear Type: Given the space $K^m$, we can apply all invertible affine linear
maps to the $m$ variables: $f(X) = aX + b$, where $a$ is an $m \times m$ invertible matrix,
and $X$ and $b$ are in $K^m$.







2) The de Jonquieres Type: These maps give isomorphisms of the corresponding
polynomial rings, which are called tame transformations in algebraic geometry,
and they can be easily inverted. TTM stands for the Tamed Transformation
Method.
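
The easy inversion of a de Jonquieres map follows from its triangular form: the last coordinate is given directly, and each earlier coordinate is obtained by subtracting a polynomial in coordinates that are already known. A minimal sketch, assuming the toy prime field GF(7) and two hypothetical polynomials $g_1$, $g_2$ (the paper works in characteristic 2, and these names are illustrative only):

```python
P = 7  # toy prime field; an assumption made for simple modular arithmetic

def de_jonquieres(z, g, p=P):
    """Apply J(z) = (z_1 + g_1(z_2..z_m), ..., z_{m-1} + g_{m-1}(z_m), z_m)."""
    m = len(z)
    return [(z[i] + g[i](z[i + 1:])) % p for i in range(m - 1)] + [z[-1] % p]

def invert_de_jonquieres(w, g, p=P):
    """Recover z from w = J(z) by back-substitution, last coordinate first."""
    m = len(w)
    z = [0] * m
    z[-1] = w[-1] % p
    for i in range(m - 2, -1, -1):
        # z[i + 1:] has already been recovered at this point.
        z[i] = (w[i] - g[i](z[i + 1:])) % p
    return z

# Hypothetical polynomials for m = 3.
g = [
    lambda t: t[0] * t[1] + 3 * t[0],  # g_1(z_2, z_3)
    lambda t: t[0] * t[0] + 1,         # g_2(z_3)
]

z = [4, 1, 6]
w = de_jonquieres(z, g)
recovered = invert_de_jonquieres(w, g)
```

The inverse costs one evaluation of each $g_i$, which is why knowledge of the factorization makes decryption cheap.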

However, due to the size of the public key and the complexity
of the public computations, any practical and efficient system requires the
polynomial components of the cipher to be of degree 2, which seems to be very
difficult to accomplish.

In [7], a quadratic construction is obtained by instead using the map

$$\bar F(x_1,\dots,x_n) = F(x_1,\dots,x_n,0,0,\dots,0),$$

where $F(x_1,\dots,x_m) = \phi_4 \circ \phi_3 \circ \phi_2 \circ \phi_1(x_1,x_2,\dots,x_m)$, $\phi_1$ and $\phi_4$ are of invertible
linear type, $\phi_3$, $\phi_2$ are of the de Jonquieres type, $\phi_2$ is of degree 2 and $\phi_3$ is of a
high degree (8). This map $\bar F$, which can be viewed as a map from $K^n$ to $K^m$, is
an “invertible” map in the sense that it is injective, and given any element in the
image of $\bar F$, we can use $F^{-1}$ to recover its preimage easily.

The key component of the construction of the TTM systems is based on
a special multivariable polynomial $Q_8(z_1,\dots,z_l)$ and a special set of quadratic
polynomials $q_i(z_1,\dots,z_k)$, $i = 1,\dots,l$, such that $Q_8(q_1,\dots,q_l)$ is still quadratic
in the $z_i$. Though the constructions of the TTM schemes are very interesting from
a theoretical and a practical point of view, in particular from the point of view of
algebraic geometry, no principle was given about how $Q_8$ and $q_i$ are constructed.
Our attack starts from an observation of a special property of the polynomials $Q_8$
and $q_i$.



2.2. Cryptanalysis of the Revised Version of [2] 

2.2.1. The scheme. In this subsection, we will use essentially the notation in the
revised version of [2].

First the finite field $K$ is of size $2^8$, and $m = n + 52$. The map $F$ is made of
$\phi_1, \phi_2, \phi_3, \phi_4$; $F = \phi_4 \circ \phi_3 \circ \phi_2 \circ \phi_1(x_1,x_2,\dots,x_{n+52})$, which are maps from the
$(n+52)$-dimensional space into itself and are defined in [2]. $\phi_1 = (\phi_{1,1},\dots,\phi_{1,n+52})$,
$\phi_4 = (\phi_{4,1},\dots,\phi_{4,n+52})$ are invertible affine linear maps, and $\phi_{1,i} = x_i$ for $i > n$;
$\phi_2$ and $\phi_3$ are nonlinear maps of the de Jonquieres type.

The map $\bar F(x_1,\dots,x_n) = (y_1,\dots,y_{n+52}) = \phi_4 \circ \phi_3 \circ \phi_2 \circ \phi_1(x_1,\dots,x_n,0,\dots,0) =
\phi_4 \circ \phi_3 \circ \phi_2 \circ \bar\phi_1(x_1,\dots,x_n)$ is the cipher, which is public, but $\phi_1, \phi_4$ are private.
$\bar\phi_1(x_1,\dots,x_n) = \phi_1(x_1,x_2,\dots,x_n,0,\dots,0)$ is an injective map from $K^n$ to $K^{n+52}$.
In the expansion formula, the components $y_i$ of the map $\bar F$ are degree two
polynomials of the variables $(x_1,\dots,x_n)$.

To attack this cryptosystem is to solve the set of equations $y_i(x_1,\dots,x_n) = y'_i$
for $i = 1,\dots,m$, with the variables $x_j$, $j = 1,\dots,n$, and an element
$(y'_1,\dots,y'_{n+52})$ in $K^m$. Here $(y'_1,\dots,y'_{n+52})$ can be viewed as the ciphertext, and the
solution $(x'_1,\dots,x'_n) \in K^n$ is the plaintext.







In [2] it is claimed that, if $n = 48$ ($m = 100$), no practical method can work
efficiently to attack such a system, in particular the Minrank method in [3], and
that the complexity of an attack by the Minrank method is far bigger than $2^{84}$.

In this scheme, $\phi_2(x_1,\dots,x_n) = (\phi_{2,1},\dots,\phi_{2,100})$ is given by

$$\begin{aligned}
\phi_{2,1} &= x_1; \\
\phi_{2,i} &= x_i + f_i(x_1,\dots,x_{i-1}), & i &= 2,3,\dots,41; \\
\phi_{2,i} &= q_{i-41}(x_{38},\dots,x_{48}), & i &= 42,\dots,48; \\
\phi_{2,i} &= x_i + q_{i-41}(x_{38},\dots,x_{48}), & i &= 49,\dots,76; \\
\phi_{2,i} &= x_i + q_{i-72}(x_{36},x_{39},x_{40},\dots,x_{45},x_{37},x_{47},x_{48}), & i &= 77,\dots,84; \\
\phi_{2,i} &= x_i + q_{i-80}(x_{34},x_{39},x_{40},\dots,x_{45},x_{35},x_{47},x_{48}), & i &= 85,\dots,92; \\
\phi_{2,i} &= x_i + q_{i-88}(x_{32},x_{39},x_{40},\dots,x_{45},x_{33},x_{47},x_{48}), & i &= 93,\dots,100,
\end{aligned}$$



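where $a_1$ and $a_3$ can be any nonzero number in the field $K$,

$$\begin{aligned}
Q_8(q_1,\dots,q_{35}) ={}& (q_5 q_{13} + q_8 q_{14})(q_{19} q_{32} + q_2(q_{18} + q_{24}))^2 (q_{20} q_{19} + q_{23} q_{18}) \\
&+ (q_{32} q_3 + (q_{18} + q_{24}) q_{21})^2 (q_{22} q_{19} + q_{23} q_{24})(q_9 q_{13} + q_8 q_{15}) \\
&+ a_1^{8}\big((q_{25} q_{26} + q_{27} q_{28})(q_6 q_{29} + q_7 q_{16}) + (q_{10} q_{30} + q_{11} q_{31})(q_{17} q_1 + q_{18} q_4)\big) \\
&+ a_1^{12}(q_6 q_{33} + q_{34} q_7 + q_5 q_{35} + q_{14} q_{12}),
\end{aligned}$$

and

$$\begin{array}{lll}
q_1 = z_4 z_2 + a_1 z_5, & q_2 = z_3 z_4 + a_1 z_6, & q_3 = z_2 z_5 + a_1 z_7, \\
q_4 = z_4 z_7 + a_1 z_8, & q_5 = z_1 z_5 + a_1 z_9, & q_6 = z_1 z_2 + a_1 z_{10}, \\
q_7 = z_2 z_9 + a_1 z_{11}, & q_8 = z_3 z_9 + a_1 z_1, & q_9 = z_1 z_3, \\
q_{10} = z_1 z_7 + a_1 z_9, & q_{11} = z_4 z_9 + a_1 z_1, & q_{12} = z_7 z_9 + a_1 z_1, \\
q_{13} = z_3 z_{11} + a_1 z_{10}, & q_{14} = z_5 z_{10} + a_1 z_{11}, & q_{15} = z_3 z_{10}, \\
q_{16} = z_2 z_{10}, & q_{17} = z_7 z_8 + a_1 z_7, & q_{18} = z_5 z_7 + a_1 z_2, \\
q_{19} = z_2 z_3 + a_1 z_7, & q_{20} = z_5 z_8 + a_1 z_5, & q_{21} = z_4 z_5 + a_1 z_6, \\
q_{22} = z_3 z_8, & q_{23} = z_3 z_5 + a_1 z_8, & q_{24} = z_3 z_7, \\
q_{25} = z_6 z_8 + a_3 z_5, & q_{26} = z_2 z_6, & q_{27} = z_5 z_6, \\
q_{28} = z_6 z_7 + a_3 z_2, & q_{29} = z_2 z_{11}, & q_{30} = z_4 z_{11} + a_1 z_{10}, \\
q_{31} = z_7 z_{10} + a_1 z_{11}, & q_{32} = z_3 z_6 + z_5 z_6 + a_1 z_4, & q_{33} = z_8 z_{11}, \\
q_{34} = z_8 z_{10}, & q_{35} = z_7 z_{11} + a_1 z_{10}. &
\end{array}$$

The $f_i(x_1,\dots,x_{i-1})$ are randomly chosen quadratic functions.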

$\phi_3(x_1,\dots,x_n) = (\phi_{3,1},\dots,\phi_{3,100})$ is given as: $\phi_{3,i} = x_i$, $i = 5,\dots,100$; and
$\phi_{3,i} = x_i + R_i(x_1,\dots,x_{100})$, $i = 1,2,3,4$; where the $R_i(x_1,\dots,x_{100}) = \sum_{j} \beta_{ij} P_j$
are linearly independent and the $P_i$’s are given as follows:

$P_i = Q_8(x_{42},\dots,x_{45},x_{101-8i},\dots,x_{108-8i},x_{54},\dots,x_{76})$, for $i = 1,2,3$,
and $P_4 = Q_8(x_{42},\dots,x_{76})$.



Remark. In the new version of [2], the polynomials $Q_8$ and $q_i$ actually have three
free parameters $a_1$, $a_2$ and $a_3$. We checked the formulas and found out that in
order to make the cipher $F$ of degree 2, one must make $a_1$ equal to $a_2$. We
impose this condition on this implementation scheme.







Because of the specific form of $\phi_1$, we can write:

$$\phi_1(x_1,x_2,\dots,x_{48},0,\dots,0) = \bar\phi_1(x_1,\dots,x_{48}) = \pi \circ \hat\phi_1(x_1,\dots,x_{48}),$$

where $\pi$ is the standard embedding that maps $K^{48}$ into $K^{100}$:

$$\pi(x_1,\dots,x_{48}) = (x_1,\dots,x_{48},0,0,\dots,0),$$

and $\hat\phi_1(x_1,\dots,x_{48}) = (\hat\phi_{1,1}(x_1,\dots,x_{48}),\dots,\hat\phi_{1,48}(x_1,\dots,x_{48}))$ is an invertible
affine linear transformation from $K^{48}$ to itself.

Let $\phi_3 \circ \phi_2 \circ \pi = \bar\phi_{32}$, then

$$\begin{aligned}
\bar F(x_1,\dots,x_{48}) &= \phi_4 \circ \phi_3 \circ \phi_2 \circ \phi_1(x_1,x_2,\dots,x_{48},0,\dots,0) \\
&= \phi_4 \circ \phi_3 \circ \phi_2 \circ \pi \circ \hat\phi_1(x_1,\dots,x_{48}) \\
&= \phi_4 \circ \bar\phi_{32} \circ \hat\phi_1(x_1,\dots,x_{48}).
\end{aligned}$$



Let $\bar\phi_{32}(x_1,\dots,x_{48}) = (\bar\phi_{32,1},\dots,\bar\phi_{32,100})$; then for the different values of the
index $i$:

$$\begin{aligned}
\bar\phi_{32,i} &= x_i + a_1^{14}\beta_{i4}(x_{38}x_{48} + x_{47}x_{46}) + a_1^{14}\sum_{j=1}^{3}\beta_{ij}(x_{38-2j}x_{48} + x_{39-2j}x_{47}), & i &= 1; \\
\bar\phi_{32,i} &= x_i + f_i(x_1,\dots,x_{i-1}) + a_1^{14}\beta_{i4}(x_{38}x_{48} + x_{47}x_{46}) \\
&\quad + a_1^{14}\sum_{j=1}^{3}\beta_{ij}(x_{38-2j}x_{48} + x_{39-2j}x_{47}), & i &= 2,3,4; \\
\bar\phi_{32,i} &= x_i + f_i(x_1,\dots,x_{i-1}), & i &= 5,6,\dots,41; \\
\bar\phi_{32,i} &= q_{i-41}(x_{38},\dots,x_{48}), & i &= 42,\dots,48; \\
\bar\phi_{32,i} &= q_{i-41}(x_{38},\dots,x_{48}), & i &= 49,\dots,76; \\
\bar\phi_{32,i} &= q_{i-72}(x_{36},x_{39},x_{40},\dots,x_{45},x_{37},x_{47},x_{48}), & i &= 77,\dots,84; \\
\bar\phi_{32,i} &= q_{i-80}(x_{34},x_{39},x_{40},\dots,x_{45},x_{35},x_{47},x_{48}), & i &= 85,\dots,92; \\
\bar\phi_{32,i} &= q_{i-88}(x_{32},x_{39},x_{40},\dots,x_{45},x_{33},x_{47},x_{48}), & i &= 93,\dots,100.
\end{aligned}$$



The formula above is due to the fact that $Q_8(q_1,\dots,q_{35}) = a_1^{14}(z_9 z_{10} + z_1 z_{11})$,
which is the reason why $F$ is of degree 2.

2.2.2. The basic idea of the cryptanalysis. Our attack starts from the observation
that the $q_i$ are very simple quadratic polynomials, almost all of which have only one
quadratic term. In this case, $Q_8$ has 35 variables and the $q_i$ have 11 variables, and we have
$q_9 = z_1 z_3$, $q_{15} = z_3 z_{10}$. This implies that

$$z_{10} q_9 - z_1 q_{15} = 0. \tag{2.1}$$

In this implementation scheme, the map $\bar\phi_{32}$ has actually 4 sets of $q_i$ as its
components (with intersections). Because $F$ is derived from $\bar\phi_{32}$ by composing on
both the left side and the right side with an invertible affine linear map, the equation (2.1)
above implies that we must have linearization equations for the $y_i$, the components
of $F$. This means there is a possibility to actually use such linearization equations
to attack this scheme, which is the only method used by Patarin to defeat the
Matsumoto-Imai scheme.







Let $V$ denote the linear space of the linearization equations (1.2) satisfied by
the $y_i$ of $F$ and let $D$ be its dimension.

Let $\bar V$ denote the linear space of the linearization equations satisfied by
the $\bar\phi_{32,i}(x_1,\dots,x_{48})$ of $\bar\phi_{32}$:

$$\sum_{i=1,j=1}^{n,m} a_{ij} x_i \bar\phi_{32,j}(x_1,\dots,x_{48}) + \sum_{i=1}^{n} b_i x_i + \sum_{j=1}^{m} c_j \bar\phi_{32,j}(x_1,\dots,x_{48}) + d = 0,$$

and let $\bar D$ be the dimension of $\bar V$.

Let $\hat\phi_{32}(x_1,\dots,x_{48}) = (\hat\phi_{32,1},\dots,\hat\phi_{32,100}) = \bar\phi_{32} \circ \hat\phi_1(x_1,\dots,x_{48})$.
Let $\hat V$ denote the linear space of the linearization equations satisfied by
the $\hat\phi_{32,i}(x_1,\dots,x_{48})$ of $\hat\phi_{32}$:

$$\sum_{i=1,j=1}^{n,m} a_{ij} x_i \hat\phi_{32,j}(x_1,\dots,x_{48}) + \sum_{i=1}^{n} b_i x_i + \sum_{j=1}^{m} c_j \hat\phi_{32,j}(x_1,\dots,x_{48}) + d = 0,$$

and let $\hat D$ be the dimension of $\hat V$.

Let $\phi_{4,i}$ denote the components of $\phi_4$ and $\hat\phi_{1,i}$ denote the components of $\hat\phi_1$.
Let $(\phi_4^{-1})_i$ denote the components of $\phi_4^{-1}$ and $(\hat\phi_1^{-1})_i$ denote the components of $\hat\phi_1^{-1}$.
Let $M$ be the map from $\hat V$ to $V$ given by:

$$\begin{aligned}
M : {}& \Big(\sum a_{ij} x_i \hat\phi_{32,j}(x_1,\dots,x_{48}) + \sum b_i x_i + \sum c_j \hat\phi_{32,j}(x_1,\dots,x_{48}) + d = 0\Big) \to \\
& \Big(\sum a_{ij} x_i (\phi_4^{-1})_j(y_1(x_1,\dots,x_{48}),\dots,y_{100}(x_1,\dots,x_{48})) + \sum b_i x_i \\
& \quad + \sum c_j (\phi_4^{-1})_j(y_1(x_1,\dots,x_{48}),\dots,y_{100}(x_1,\dots,x_{48})) + d = 0\Big).
\end{aligned}$$

Let $\hat M$ be the map from $\bar V$ to $\hat V$ given by:

$$\begin{aligned}
\hat M : {}& \Big(\sum a_{ij} x_i \bar\phi_{32,j}(x_1,\dots,x_{48}) + \sum b_i x_i + \sum c_j \bar\phi_{32,j}(x_1,\dots,x_{48}) + d = 0\Big) \to \\
& \Big(\sum a_{ij} \hat\phi_{1,i}(x_1,\dots,x_{48}) \hat\phi_{32,j}(x_1,\dots,x_{48}) + \sum b_i \hat\phi_{1,i}(x_1,\dots,x_{48}) \\
& \quad + \sum c_j \hat\phi_{32,j}(x_1,\dots,x_{48}) + d = 0\Big).
\end{aligned}$$

Theorem 1. $M$ and $\hat M$ are invertible linear maps and $D = \bar D = \hat D$.

The proof follows from the fact that both $\phi_4$ and $\hat\phi_1$ are invertible affine linear
maps. Essentially the map $\hat M$ is a change of basis of the $x_i$ and the map $M$ is an affine
linear transformation of the substitution of $\hat\phi_{32,i}$ by $y_i$. This means that we only
need to find $\bar D$ to find $D$, and we did so by computations.

First we choose the field $K$ to be $K = Z_2[x]/(x^8 + x^6 + x^5 + x + 1)$. Because
$a_1$ and $a_3$ can be any nonzero constants, we choose them both to be 1. Then we
choose $f_i(x_1,\dots,x_{i-1})$, $i = 2,\dots,41$, randomly as quadratic polynomials over $K$
and $\beta_{ij}$ randomly in $K$ (but satisfying the condition that the $R_i$ are linearly independent).
We chose 10 different sets of $f_i(x_1,\dots,x_{i-1})$ and $\beta_{ij}$ for testing. For all these
10 choices, our computation showed that the dimension $\bar D = 347$ and that all
linearization equations are of the form

$$\sum_{i>31}\sum_{j>41} a_{ij} x_i \bar\phi_{32,j}(x_1,\dots,x_{48}) + \sum_{i>31} b_i x_i + \sum_{j>41} c_j \bar\phi_{32,j}(x_1,\dots,x_{48}) = 0,$$

and the polynomials $\bar\phi_{32,j}(x_1,\dots,x_{48})$, for $j > 41$, depend only on the 17 variables
$x_i$, with $i > 31$.

Though we have such a large number of linearization equations, we are not
sure how many linearly independent equations they will produce for a given
ciphertext $y'_i$.

Let $(x'_1,\dots,x'_{48})$ be an element in $K^{48}$. Let $y'_i = y_i(x'_1,\dots,x'_{48})$, $\hat\phi'_{32,i} =
\hat\phi_{32,i}(x'_1,\dots,x'_{48})$. Let $U$ be the space of linear equations derived from substitution
of $y_i$ by the values $y'_i$ in $V$. Let $\hat U$ be the linear space of linear equations derived
from substitution of $\hat\phi_{32,i}$ by the values $\hat\phi'_{32,i}$ in $\hat V$. Let $\bar U$ be the linear space of
linear equations derived from substitution of $\hat\phi_{32,i}$ by the values $\hat\phi'_{32,i}$ and $x_i$ by
$(\hat\phi_1^{-1})_i(x_1,\dots,x_{48})$ in $\hat V$.

For a linear equation $\sum a_i x_i + b = 0$, we define $\tilde M$ to be the linear map:

$$\tilde M : \Big(\sum_{i=1}^{48} a_i x_i + b = 0\Big) \to \Big(\sum_{i=1}^{48} a_i \hat\phi_{1,i}(x_1,\dots,x_{48}) + b = 0\Big).$$

Theorem 2. The dimension of $U$ is equal to the dimension of $\hat U$ and the dimension of
$\bar U$. $U = \hat U = \tilde M(\bar U)$.

This is proven easily by using the maps $M$ and $\hat M$.

Because all linearization relations in $\bar V$ are expressed in the last 59
components $\bar\phi_{32,j}(x_1,\dots,x_{48})$, $j > 41$, and they are all expressed in terms of the quadratic
polynomials $q_i$, and they involve only the last 17 variables $x_i$, $i = 32,\dots,48$, we took
200 samples of randomly chosen values $x'_{32},x'_{33},\dots,x'_{48}$ for $x_{32},\dots,x_{48}$, computed
the corresponding values of $\bar\phi_{32,j}$, $j > 41$, for these $x'_{32},x'_{33},\dots,x'_{48}$ and then
substituted the values of $\bar\phi_{32,j}$, $j > 41$, into the 347 linearization equations. We found
out that these 347 linearization equations in $\bar V$ actually produce 17 linearly
independent equations of $x_i$, $i > 31$, and by solving those equations we have $x_i = x'_i$,
$i = 32,\dots,48$.

Then we notice that if all $x_i$ are set to be zero, which means $\bar\phi_{32,i}(0,\dots,0) = 0$
for any $i$, the linearization equations in $\bar V$ will not produce 17 linearly independent
equations at all. So instead of choosing the values randomly, we chose $(x'_{32},\dots,x'_{48})$
to be the ones with many zeros, and we found out (with 500 random samples) that
as long as at least 5 of $x'_{32},\dots,x'_{48}$ are not zero, by substituting the corresponding
values of $\bar\phi_{32,j}$, $j > 41$, into the 347 linearly independent linearization equations in
$\bar V$, these 347 linearization equations will actually produce 17 linearly independent
linear equations of $x_i$, $i > 31$, and by solving those equations we again recover the
values of $x'_i$ by the solution $x_i = x'_i$, $i = 32,\dots,48$.

Among all possible values of $x_i$, $i = 32,\dots,48$, the probability that at most
5 of the $x_i$, $i = 32,\dots,48$, are nonzero is at most
$\binom{17}{5} 2^{5\times 8}/2^{17\times 8} = \binom{17}{5}/2^{12\times 8} < 2^{-82}$.
Therefore we have a probability $1 - \binom{17}{5}/2^{12\times 8} > 1 - 2^{-82}$ that the linearization equations
will produce 17 linearly independent equations for a given set of values of $\bar\phi_{32,i}$,
and solving those equations will recover the values of $x'_{32},\dots,x'_{48}$ if we are given
the corresponding values of $\bar\phi_{32,j}$, $j > 41$.
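
The size of this probability is easy to confirm with exact integer arithmetic. The sketch below assumes that the 17 values are independent and uniformly distributed over a field of size $2^8$ (an assumption, since the paper only reports experiments) and compares the exact count with the estimate $\binom{17}{5}/2^{12\times 8}$; the function name is illustrative.

```python
from fractions import Fraction
from math import comb

FIELD_SIZE = 2 ** 8  # size of the field K
NUM_VARS = 17        # the variables x_32, ..., x_48

def prob_at_most_k_nonzero(k):
    """Exact probability that at most k of NUM_VARS uniform field elements
    are nonzero: sum over j <= k of C(17, j) * 255^j, divided by 256^17."""
    total = FIELD_SIZE ** NUM_VARS
    favourable = sum(comb(NUM_VARS, j) * (FIELD_SIZE - 1) ** j
                     for j in range(k + 1))
    return Fraction(favourable, total)

exact = prob_at_most_k_nonzero(5)
estimate = Fraction(comb(17, 5), 2 ** (12 * 8))
bound = Fraction(1, 2 ** 82)
```

Under these assumptions the exact value lies below the estimate, which in turn lies below $2^{-82}$.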

With Theorem 1 and Theorem 2, we conclude that with the probability
$1 - \binom{17}{5}/2^{12\times 8} > 1 - 2^{-82}$, the linearization equations of $y_i$ in $V$ will produce 17 linearly
independent equations satisfied by the $x_i$ for a given ciphertext $(y'_1,\dots,y'_{100})$. This is
the first step of our attack. Here we would like to emphasize that the statement
about the probability to derive 17 linearly independent linear equations from a
ciphertext is based on computational experiments, not on any theoretical argument,
though it seems possible to actually prove it.

Let us assume that we now have 17 linearly independent equations in $U$ derived
from a ciphertext $(y'_1,\dots,y'_{100})$ and its substitution in $V$. Let $(x'_1,\dots,x'_{48})$ be
the corresponding plaintext. This set of linear equations surely is not enough to
recover the original plaintext. However, we know that if we have seventeen linearly
independent equations, we can use Gaussian elimination to find two sets
$A = \{u_1,\dots,u_{17}\}$, $B = \{v_1,\dots,v_{31}\}$, $A \cap B = \emptyset$ and $A \cup B = \{1,\dots,48\}$, such
that we can derive 17 linearly independent linear equations in the form
$x_{u_j} = h_j(x_{v_1},\dots,x_{v_{31}})$.
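
This Gaussian elimination step can be sketched as follows. The code is an illustration under stated assumptions: it works over the toy prime field GF(7) instead of a field of size $2^8$, uses 0-based indices, and assumes the input equations are consistent (as they are when they come from a real plaintext); the name `express_pivots` is hypothetical.

```python
P = 7  # toy prime field; an assumption, the paper works over a field of size 2^8

def express_pivots(eqs, n, p=P):
    """eqs: rows [c_0, ..., c_{n-1}, const], each meaning
         sum c_i * x_i + const = 0 (mod p).
    Returns (A, B, h): pivot indices A, free indices B, and for every u in A
    a pair h[u] = (coeffs, const) with
         x_u = sum coeffs[v] * x_v over v in B, plus const (mod p)."""
    rows = [[v % p for v in r] for r in eqs]
    pivots, r = [], 0
    for c in range(n):
        piv = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][c], -1, p)
        rows[r] = [(v * inv) % p for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                f = rows[i][c]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(n) if c not in pivots]
    h = {}
    for prow, u in zip(rows, pivots):
        h[u] = ({v: (-prow[v]) % p for v in free}, (-prow[n]) % p)
    return pivots, free, h

# Toy system in 4 variables:
#   x_0 + 2*x_1 + 3*x_2 + 4 = 0   and   x_1 + x_3 + 1 = 0   (mod 7)
eqs = [[1, 2, 3, 0, 4], [0, 1, 0, 1, 1]]
A, B, h = express_pivots(eqs, 4)
```

Here `A` plays the role of $\{u_1,\dots,u_{17}\}$, `B` the role of $\{v_1,\dots,v_{31}\}$, and `h[u]` the role of $h_j$.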

Then we substitute these 17 equations into the $y_i$, which will become
quadratic polynomials with only 31 variables. We will call this new set of polynomials
$\tilde y_i$. They can be viewed as components of a map from $K^{31}$ to $K^{100}$, which will be
denoted by $\tilde F$.

Let $\phi_0$ be the map from $K^{31}$ to $K^{48}$, which is given by: $\phi_{0,i}(x_{v_1},\dots,x_{v_{31}}) = x_i$
if $i \in B$, otherwise $\phi_{0,i}(x_{v_1},\dots,x_{v_{31}}) = h_i(x_{v_1},\dots,x_{v_{31}})$, then

$$\tilde F = \phi_4 \circ \phi_3 \circ \phi_2 \circ \pi \circ \hat\phi_1 \circ \phi_0.$$

From the point of view of algebraic geometry, the substitution process is
nothing but evaluation of the $y_i$ on the variety defined by the 17 linearly
independent linear equations $x_{u_i} = h_i(x_{v_1},\dots,x_{v_{31}})$, and the existing variables are nothing
but the coordinates of this variety.

Because for the case of $\bar\phi_{32}$, if the dimension of $\bar U$ is 17, the variety is defined
by $x_i = x'_i$ for $i = 32,\dots,48$ and $x'_i \in K$, with Theorem 1 and Theorem 2, we
know that the variety defined by the linear equations in $U$ is the same variety defined
by $\hat\phi_{1,i}(x_1,\dots,x_{48}) = \hat\phi_{1,i}(x'_1,\dots,x'_{48})$, for $i > 31$, and we denote this variety by
$W$. The linear equations in $U$ are nothing but linear combinations of this set of
linear equations.

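Let

$$\tilde\phi_{32} = \bar\phi_{32} \circ \hat\phi_1 \circ \phi_0(x_{v_1},\dots,x_{v_{31}}) = (\tilde\phi_{32,1},\dots,\tilde\phi_{32,100})$$

and also define

$$\phi_{10}(x_{v_1},\dots,x_{v_{31}}) = \hat\phi_1 \circ \phi_0(x_{v_1},\dots,x_{v_{31}}) = (\phi_{10,1},\dots,\phi_{10,48}).$$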







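Then, using the expansion formula of $\phi_{32}$, we have:

$$\bar\phi_{32,i} = \phi_{10,i} + a_i^4 \beta_{i4} (\phi_{10,38}\phi_{10,48} + \phi_{10,47}\phi_{10,46}) + a_i^4 \sum_j \beta_{ij} (\phi_{10,38-2j}\phi_{10,48} + \phi_{10,39-2j}\phi_{10,47}) = \phi_{10,i} + R_i, \quad i = 1;$$

$$\bar\phi_{32,i} = \phi_{10,i} + f_i(\phi_{10,1}, \dots, \phi_{10,i-1}) + a_i^4 \beta_{i4} (\phi_{10,38}\phi_{10,48} + \phi_{10,47}\phi_{10,46}) + a_i^4 \sum_j \beta_{ij} (\phi_{10,38-2j}\phi_{10,48} + \phi_{10,39-2j}\phi_{10,47}) = \phi_{10,i} + f_i(\phi_{10,1}, \dots, \phi_{10,i-1}) + R_i, \quad i = 2, 3, 4;$$

$$\bar\phi_{32,i} = \phi_{10,i} + f_i(\phi_{10,1}, \dots, \phi_{10,i-1}), \quad i = 5, \dots, 41;$$

$$\bar\phi_{32,i} = Q_{i-31}(\phi_{10,38}, \dots, \phi_{10,48}), \quad i = 42, \dots, 48;$$

$$\bar\phi_{32,i} = Q_{i-31}(\phi_{10,38}, \dots, \phi_{10,48}), \quad i = 49, \dots, 76;$$

$$\bar\phi_{32,i} = Q_{i-72}(\phi_{10,36}, \phi_{10,39}, \phi_{10,40}, \dots, \phi_{10,45}, \phi_{10,37}, \phi_{10,47}, \phi_{10,48}), \quad i = 77, \dots, 84;$$

$$\bar\phi_{32,i} = Q_{i-85}(\phi_{10,34}, \phi_{10,39}, \phi_{10,40}, \dots, \phi_{10,45}, \phi_{10,35}, \phi_{10,47}, \phi_{10,48}), \quad i = 85, \dots, 92;$$

$$\bar\phi_{32,i} = Q_{i-93}(\phi_{10,32}, \phi_{10,39}, \phi_{10,40}, \dots, \phi_{10,45}, \phi_{10,33}, \phi_{10,47}, \phi_{10,48}), \quad i = 93, \dots, 100,$$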

where $R_i = \sum_j \beta_{ij} P_j$, and $P_k$, for $k = 1, 2, 3$, is given as

$$P_k = \phi_{1,31+2k-1}(x_1, \dots, x_{48})\,\phi_{1,48}(x_1, \dots, x_{48}) + \phi_{1,31+2k}(x_1, \dots, x_{48})\,\phi_{1,47}(x_1, \dots, x_{48}),$$

$$P_4 = \phi_{1,38}(x_1, \dots, x_{48})\,\phi_{1,48}(x_1, \dots, x_{48}) + \phi_{1,46}(x_1, \dots, x_{48})\,\phi_{1,47}(x_1, \dots, x_{48}),$$

which are constants. Namely, the $R_i(\phi_{32}(x_1, \dots, x_{48}))$ are constants on the variety $W$.
Therefore

$$\bar F(x_{v_1}, \dots, x_{v_{31}}) = (\bar y_1, \dots, \bar y_{100}) = \phi_4 \circ \bar\phi_3 \circ \phi_2 \circ \pi \circ \phi_1 \circ \phi_0 (x_{v_1}, \dots, x_{v_{31}}),$$

where $\bar\phi_3 = (\bar\phi_{3,1}, \dots, \bar\phi_{3,100})$ is given by $\bar\phi_{3,i} = x_i$ for $i = 5, \dots, 100$, and
$\bar\phi_{3,i} = x_i + R_i$ for $i = 1, 2, 3, 4$. Therefore $\phi_3$ on the variety $W$ is equivalent to
$\bar\phi_3$, which is linear and is just a translation.

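Then

$$\bar F(x_{v_1}, \dots, x_{v_{31}}) = (\phi_4 \circ \bar\phi_3) \circ \phi_2 \circ (\pi \circ \phi_1 \circ \phi_0) = \bar\phi_4 \circ \phi_2 \circ \bar\phi_1,$$

where $\bar\phi_4 = \phi_4 \circ \bar\phi_3$, $\bar\phi_1 = \pi \circ \phi_1 \circ \phi_0$, and both $\bar\phi_4$, which is invertible,
and $\bar\phi_1$, which is injective, are linear maps.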

Then $\bar F(x_{v_1}, \dots, x_{v_{31}}) = (y'_1, \dots, y'_{100})$ can be solved easily because of the
triangular form of $\phi_2$; namely, the equation above is equivalent to the equations

$$\phi_2 \circ \bar\phi_1(x_{v_1}, \dots, x_{v_{31}}) = \bar\phi_4^{-1}(y'_1, \dots, y'_{100}),$$

whose first nontrivial equation is always a linear equation.

This shows that the equations can be solved by iterating the following procedure:
first search for linear equations among the linear combinations of the quadratic
equations, and then substitute the linear equations into the quadratic equations.
Each iteration reduces the number of variables by 1. Eventually this requires
31 iterations to find the 31 linearly independent linear equations in triangular
form, whose solution gives the values of the 31 variables $x_{v_i}$. We can then
substitute the values of $x_{v_i}$, $i = 1, \dots, 31$, back into the first 17 substitution
equations $x_{u_j} = h_j(x_{v_1}, \dots, x_{v_{31}})$, $j = 1, \dots, 17$, which recovers the complete
set of $(x'_i)$, the plaintext.







Overall, our general method is first to find all linearization equations. Then,
for a given ciphertext $(y'_1, \dots, y'_m)$ corresponding to a plaintext $(x'_1, \dots, x'_n)$, we
use the linearization equations to produce enough (17) linearly independent linear
equations satisfied by the $x'_i$. We then perform a substitution using these linear
equations, which essentially makes $\phi_3$ linear on the variety defined by the 17
linear equations. The rest is straightforward.

2.2.3. The practical attack procedure and its complexity. We have three steps to
derive the plaintext $(x'_1, \dots, x'_{48})$ from a ciphertext $(y'_1, \dots, y'_{100})$, and the first
step is common to all ciphertexts.

Step 1 of the attack 

We first look for a basis of the space $V$, namely a basis of the solutions $a_{ij}$, $b_i$, $c_j$
and $d$ of the equations:

$$\sum_{i=1,j=1}^{n,m} a_{ij}\, x_i\, y_j(x_1, \dots, x_n) + \sum_{i=1}^{n} b_i x_i + \sum_{j=1}^{m} c_j\, y_j(x_1, \dots, x_n) + d = 0.$$

For this set of equations, we have 4949 = 4800 + 48 + 100 + 1 variables and
19697 = 1 + 48 + (24 × 47 + 48) + (48 + 24 × 47 + (8 × 47 × 46)) equations. We
know that the dimension of the solution space is 347.

Although we have 19697 equations, we have only 4949 variables, so we do not
need all of the equations to find the solutions. We can randomly choose 6000 of
them; the probability that these fail to give the complete solution space is
essentially zero. Solving these linear equations amounts to row operations on a
6000 × 4949 matrix. Because we are working over a finite field with only $2^8$
elements, the elimination procedure on each column requires at most $2^8 - 1$
multiplications of a given row. Eliminating each variable takes, on average,
$(2^8 - 1) \times 6000/2$ multiplications. Therefore solving these equations requires
at most $4600 \times (2^8 - 1) \times 6000/2 \approx 2^{32}$ computations on the finite field $K$.
This step is common to every attack.
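To make Step 1 concrete, the following is a toy sketch only, not the authors' computation. It assumes a hypothetical one-output map $y_1 = x_1 x_2$ over GF(2) instead of the 100 polynomials over $GF(2^8)$, and it finds the coefficients $(a_{ij}, b_i, c_j, d)$ by evaluating the monomials at every plaintext and computing a nullspace, whereas the text compares polynomial coefficients. The names `toy_map`, `monomial_row` and `nullspace_gf2` are illustrative.

```python
from itertools import product

def nullspace_gf2(rows, ncols):
    """Return a basis of {v : M v = 0} over GF(2); rows are lists of 0/1."""
    rows = [r[:] for r in rows]
    pivots = []  # pivots[i] is the pivot column of reduced row i
    r = 0
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if p is None:
            continue  # no pivot in this column: it is a free variable
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for f in (c for c in range(ncols) if c not in pivots):
        v = [0] * ncols
        v[f] = 1
        for i, c in enumerate(pivots):
            v[c] = rows[i][f]  # over GF(2), -1 equals 1
        basis.append(v)
    return basis

def toy_map(x):
    """Hypothetical public map with n = 2 inputs and m = 1 output."""
    x1, x2 = x
    return (x1 & x2,)

def monomial_row(x, y):
    """Values of [x_i*y_j ..., x_i ..., y_j ..., 1] at one plaintext."""
    return [xi & yj for xi in x for yj in y] + list(x) + list(y) + [1]

points = list(product([0, 1], repeat=2))
rows = [monomial_row(x, toy_map(x)) for x in points]
basis = nullspace_gf2(rows, len(rows[0]))
print(len(basis), basis)
```

In this toy case the unknowns are ordered $(a_{11}, a_{21}, b_1, b_2, c_1, d)$, and the nullspace contains relations such as $x_1 y_1 + y_1 = 0$, which hold for every plaintext; these play the role of the linearization equations in $V$.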

However, because we are working over the fixed field $K$, we can perform
multiplication in $K$ by first finding a generator $g$ of the multiplicative
group of $K$ and storing a table that writes each nonzero element $\gamma$ of $K$ as $g^k$;
a multiplication then costs two table searches and one addition. This improves the
speed by at least a factor of 2. Therefore, this step takes at most $2^{31}$ computations.
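The table-based multiplication described above can be sketched as follows. The paper does not say how $K$ is represented, so the reduction polynomial $x^8 + x^4 + x^3 + x + 1$ (0x11B) and the generator $g$ = 0x03 are assumptions made only for this illustration; any primitive element of any representation of $GF(2^8)$ would serve.

```python
# Assumption: K = GF(2^8) modulo x^8 + x^4 + x^3 + x + 1, with generator 0x03.
MOD = 0x11B
GEN = 0x03

def slow_mul(a, b):
    """Shift-and-add multiplication in GF(2^8), used only to build the tables."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= MOD
        b >>= 1
    return r

EXP = [0] * 510   # EXP[k] = g^k, doubled in length to avoid a modular reduction
LOG = [0] * 256   # LOG[g^k] = k for nonzero elements
v = 1
for k in range(255):
    EXP[k] = v
    LOG[v] = k
    v = slow_mul(v, GEN)
for k in range(255, 510):
    EXP[k] = EXP[k - 255]

def mul(a, b):
    """Two table lookups and one integer addition, as described in the text."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

print(hex(mul(0x57, 0x83)))
```

Doubling the length of `EXP` trades 255 extra table entries for the removal of a reduction modulo 255 on every multiplication.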

Step 2 of the attack 

For a given ciphertext $(y'_1, \dots, y'_{100})$, we substitute $y'_i$ for the polynomials $y_i$
in the 347 linearly independent solutions of the linearization equations in $V$ and
derive, by Gaussian elimination, 17 linearly independent linear equations in the $x_i$
of the form $x_{u_j} = h_j(x_{v_1}, \dots, x_{v_{31}})$, where $h_j$ is a linear function,
$A = \{u_1, \dots, u_{17}\}$, $B = \{v_1, \dots, v_{31}\}$, $A \cap B = \emptyset$ and $A \cup B = \{1, \dots, 48\}$. We
then substitute them into the $y_i$ to obtain polynomials in only the 31 variables
$\{x_{v_1}, \dots, x_{v_{31}}\}$.







With probability $2^{-82}$ we might fail to get 17 linearly independent
equations; this can safely be neglected.

When we substitute $y'_i$ for $y_i$, we need $347 \times (4800 + 100) \approx 2^{21}$
computations. Then, reducing the 347 equations to the 17 equations used for substitution
takes $(2^8 - 1) \times 48 \times 347/2 \approx 2^{21}$ computations. Finally, substituting the
17 equations into the $y_i$ takes $100 \times (2 \times 17^2 + 17 \times 31 + 17) \approx 2^{17}$ computations.

For the new 100 polynomials in 31 variables, which we denote by $\bar y_i$, we
first write down the 100 equations $\bar y_i - y'_i = \bar y_{1i}(x_{v_1}, \dots, x_{v_{31}}) = 0$; they
are linearly dependent, and the dimension is actually only 41.

Step 3 of the attack 

For the equations $\bar y_{1i}(x_{v_1}, \dots, x_{v_{31}}) = 0$, $i = 1, \dots, 100$, we use Gaussian
elimination, first on the quadratic terms, to derive $\bar m$ ($\bar m = 41$) linearly
independent equations $\bar y_{1i}(x_{v_1}, \dots, x_{v_{31}}) = 0$, $i = 1, \dots, \bar m$, the last of which is
linear. We then take the linear equation out and substitute it into the remaining
$\bar m - 1$ quadratic equations $\bar y_{1i}(x_{v_1}, \dots, x_{v_{31}}) = 0$,
$i = 1, \dots, \bar m - 1$. We denote the new equations by $\bar y_{2i}(x_{v_1}, \dots, x_{v_i}, x_{v_{i+2}}, \dots, x_{v_{31}}) = 0$.

We then repeat the same process on these new equations, again and again, for a
total of 31 times. We collect all 31 linear equations derived in this process, which
form a set of 31 linearly independent equations in triangular form. Their solution
gives us all the values of the $x_{v_i}$; we then plug them back into the 17 linear equations
$x_{u_j} = h_j(x_{v_1}, \dots, x_{v_{31}})$ of Step 2, which gives us the $x_{u_j}$. This recovers the plaintext.

For the first part of this step, we need at most $100 \times (31 \times 16 + 31) \times 100/2 \approx 2^{22}$
computations to perform the Gaussian elimination, and the substitution then
takes at most $42 \times 31^2 \approx 2^{15}$ computations. We need to perform these two
procedures at most 31 times; therefore it takes at most $2^{25}$ computations to
solve the equations for the 31 variables, and then $2^9$ computations to
find the values of the other 17 variables.
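As a quick sanity check, the operation counts quoted in Steps 1 to 3 can be re-evaluated directly. The sketch below only recomputes the products that appear in the text and prints their base-2 logarithms; the dictionary keys are descriptive labels chosen here, not terminology from the paper.

```python
from math import log2

# Each entry reproduces one arithmetic expression quoted in the text.
counts = {
    "unknowns": 4800 + 48 + 100 + 1,
    "equations": 1 + 48 + (24 * 47 + 48) + (48 + 24 * 47 + 8 * 47 * 46),
    "step1_elimination": 4600 * (2**8 - 1) * 6000 // 2,
    "step2_substitute_ciphertext": 347 * (4800 + 100),
    "step2_reduce_to_17": (2**8 - 1) * 48 * 347 // 2,
    "step2_substitute_linear": 100 * (2 * 17**2 + 17 * 31 + 17),
    "step3_elimination": 100 * (31 * 16 + 31) * 100 // 2,
    "step3_substitution": 42 * 31**2,
}

for name, value in counts.items():
    print(f"{name:30s} {value:>12d}  ~ 2^{log2(value):.1f}")
```

The powers of two in the text are rounded estimates of these products; the Step 1 elimination dominates, which is why the overall cost is governed by that step.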

Adding all three steps together, the attack takes at most $2^{32}$ computations. We
simulated an example of this scheme, and it took a few hours to find the
plaintext for any given ciphertext. The best test of our method would be
to attack the challenges set by T. Moh. However, at this moment the web site
(www.usdsi.com) where the challenge is posted does not allow public access to
the challenge data; we plan to attack the challenges once the data is available.

2.3. Cryptanalysis for the Scheme in the First Version of [2] 

Our work was first done for the scheme in the first version of [2]. When we had
finished the work on this scheme in July last year, the new version appeared. In
this section, we present our work on the implementation scheme in the original
version of [2]. The construction of the scheme is similar to that of the revised case above.

We again work over the field $K$ of size $2^8$. The map $F$ is built from $\phi_1, \phi_2, \phi_3, \phi_4$,
which are maps from the $(n + 68)$-dimensional space to itself. $\phi_1$ and $\phi_4$ are invertible
affine maps, and $\phi_2$ and $\phi_3$ are nonlinear maps of de Jonquières type, but different from
those of the section above.







Again the map

$$F(x_1, \dots, x_n) = \phi_4 \circ \phi_3 \circ \phi_2 \circ \phi_1(x_1, x_2, \dots, x_n, 0, \dots, 0) = (y_1, \dots, y_{n+68})$$

is the cipher, which is public, while $\phi_1$, $\phi_4$ are private. To make sure the system is of
degree 2, another set of polynomials $Q_8(z_1, \dots, z_{42})$ and $q_i(z_1, \dots, z_{14})$ is used.
The details of $\phi_3$, $\phi_2$, $Q_8$ and the $q_i$ are given as follows, with $m = n + 68$.

We will not give the exact details of $\phi_3$ and $\phi_2$ (for these we refer to the
original version of [2]); rather, we give the details of $\phi_{32} = \phi_3 \circ \phi_2 \circ \pi$, where
$\pi(x_1, \dots, x_n) = (x_1, \dots, x_n, 0, \dots, 0)$ is a map from $K^n$ to $K^{n+68}$:

$$\phi_{32,1} = y_1 = x_1 + Q_8(y_{n-7}, \dots, y_{n+34}) + Q_8(y_{n-21}, \dots, y_{n-14}, y_{n+35}, \dots, y_{n+68}) = x_1 + x_{n-5}x_{n-6} + x_{n-8}x_n + x_{n-19}x_{n-20} + x_{n-22}x_{n-14},$$

$$\phi_{32,2} = y_2 = x_2 + x_1 + Q_8(y_{n-21}, \dots, y_{n-14}, y_{n+35}, \dots, y_{n+68}) = x_1 + x_2 + x_{n-19}x_{n-20} + x_{n-22}x_{n-14},$$

$$\phi_{32,i} = y_i = x_i + f_i(x_1, \dots, x_{i-1}), \quad i = 3, \dots, n-22;$$

$$\phi_{32,i} = y_i = q_{i-(n-24)}(x_{n-27}, \dots, x_{n-14}), \quad i = n-23, \dots, n-14;$$

$$\phi_{32,i} = y_i = x_i + f_i(x_1, \dots, x_{i-1}), \quad i = n-13, \dots, n-8;$$

$$\phi_{32,i} = y_i = q_{i-(n-8)}(x_{n-13}, \dots, x_n), \quad i = n-7, \dots, n+34;$$

$$\phi_{32,i} = y_i = q_{i-(n+26)}(x_{n-27}, \dots, x_{n-14}), \quad i = n+35, \dots, n+68,$$

where 

CSC#!? • • • ,942) = [914923 + 917924 ] [91099 + 9694] 2 [9ll930 + 9l93l] + 

[933934 + 935936] [915937 4- 916926] + [91998 + 920938] [91397 + 91295] + 

[921939 -T 94092 4- 922941 4- 94293 ]; 
the $q_i$ are functions of the 14 variables $z_1, z_2, \dots, z_{14}$:

9l = 27 -f 2 2 2 : 5 , 92 = 2 8 + 2 6 2 7 , 93 = z 9 4- 26 ^ 5 , 

94 = 2 2 2: 4 4- 210, 95 = ^5 4- 2ll, 96 = ^1^3 + Z 12 , 

97 = 2 3 2 7 + 213 , 98 = 214 4- 2 8 2 3 , q 9 = Z 3 + 2: 2 2 i2 , 

910 = 24 4- 2 io 2 i, 911 = 213 4- 2ll2 2 , 9l2 = ^4^13 + 2 7? . 

9l3 ~ 24211 4- 25 , 9l4 = 26 4- 29 2 2 , 9l5 — 214 + Z9Z10, 

916 = 2 g + 210 ^ 6 , 917 = 29 4- 2 i 2 6 , 9 18 = 2 9 2 i, 

9l9 = 29 Z 4 + 26 , 920 = 26^3 + 2 9 , 9 2i = 2 : 7 2 9 + 2 i4 , 

922 = 29213 4- 2 6 , 923 = 2 i 2: 8 4" Z 14 , 924 = 2 2 2: i4 4~ 

925 = 21421 , q26 = 2io2i4, 9 27 = 2 2 Zio, 

928 = 2 2 2 3 , 929 = 212 : 11 , 930 = 212:7 4 - 25 , 

931 = 21213 + Zn, g 32 = 2 : 125 , 933 = Z12ZH 4- 213, 

934 = 2 i2 2 7 , 935 = 2i 2 2i 3 , q 36 = Z 12 Z 5 + Z 7y 

937 = 2i 0 2 8 , 938 = 24214 + Z 8 , 939 = Z 8 Z U , 

940 = 2i 4 2ii, <741 = 2 8 2 5 4- 214 , 942 = 2i 4 2i 3 + 2 8 . 



and

$$Q_8(q_1, \dots, q_{42}) = z_8 z_9 + z_6 z_{14}.$$







Through computations and an argument similar to that of the section above, we have:

1) The dimension of the space $V$ of linearization equations for the components
$y_i$ of $F$ is 286;

2) For a given ciphertext $(y'_1, \dots, y'_{n+68})$, with probability $1 - 2^{-70}$, the
linearization equations produce 28 linearly independent linear equations
in the $x_i$;

3) In the case of 28 linearly independent equations, we can again substitute
these 28 linear equations into the $y_i$ to derive a new operator $\bar F$,
which is a map from $K^{n-28}$ to $K^{n+68}$, with $\bar F = \bar\phi_4 \circ \phi_2 \circ \bar\phi_1$ for some linear
maps $\bar\phi_4$, invertible, and $\bar\phi_1$, injective.

This allows us to use exactly the same attack steps as in the previous section to
defeat this scheme.

Here, if we choose $n = 52$, $m = 120$, we estimate that it takes about $2^{35}$
computations on $K$ to defeat the scheme. We ran a computational example
of such a scheme on a PC, and it took a few days to find the plaintext from a given
ciphertext.

By now, several implementation schemes have been suggested
by the inventor. In all cases the $q_i$ components are very simple and never have
more than 2 quadratic monomials; as a result, in every scheme the dimension of the
linear space of all the linearization equations for the components $y_i$ of $F$ is not
small. This is a common defect of the implementation schemes, and it is not in any
way desirable for a secure open key cryptosystem. However, the existence of those
linearization equations does not necessarily mean that finding the plaintext from a
given ciphertext is easy. The first implementation [7] is an example of this
situation, although that scheme was defeated by another method [5].

3. Conclusion 

A key component of the TTM schemes is a set of quadratic polynomials $q_i$. These
polynomials are very simple, and often a $q_i$ consists of just one degree two
monomial. Due to this fact, we show through computations that in all TTM implementation
schemes the polynomial components $y_i$ of the public cipher $F$ satisfy
linearization equations, and for a given ciphertext $y'$ we can obtain linear equations
satisfied by the plaintext $x'$. This is something that a secure open key cryptosystem
should not have. This defect does not necessarily allow us to defeat all implementation
schemes easily, but for the two most recent implementation
schemes, suggested in two different versions of [2], we show that, with a very small
probability of failure, this defect allows us to defeat the schemes easily. For the
suggested practical example in the revised paper [2], we show that it takes $2^{32}$
computations on the finite field $K$ to defeat the scheme, and we confirmed this by a
computational example. For the scheme in the original version of [2], it takes
$2^{35}$ computations to defeat it, which is confirmed by a computer experiment as well.







Also, for the existing TTM implementation schemes, we can even find higher
order linearization equations, which a secure scheme should avoid as well.

Considering all the attacks on the TTM schemes, [3, 5] and this paper, we
conclude that all existing TTM schemes are insecure. We do not, however,
suggest that our results imply that good TTM schemes do not exist. We
do conclude that, to avoid the defect found in this paper, a more sophisticated
construction of $Q_8$ and the $q_i$ is needed. We think this is a very interesting direction
to pursue, but it needs some deep insight from algebraic geometry.

References 

[1] Chou, G., Guan, J., Chen, J., A systematic construction of a $Q_{2^k}$-model in TTM,
Comm. in Algebra, 30(2), 551-562, (2002).

[2] Chen, J., Moh, T., On the Goubin-Courtois attack on TTM, Cryptology ePrint
Archive (2001/72).

[3] Goubin, L., Courtois, N., Cryptanalysis of the TTM cryptosystem, Asiacrypt 2000,
LNCS 1976, 44-57.

[4] Dickerson, M., The inverse of an automorphism in polynomial time, J. Symbolic
Comput. 13 (1992), no. 2, 209-220.

[5] Ding, J., Hodges, T., Cryptanalysis of an implementation scheme of TTM,
Department of Mathematical Sciences, University of Cincinnati, Preprint 2002.

[6] Matsumoto, T., Imai, H., Public quadratic polynomial-tuples for efficient
signature-verification and message-encryption, Advances in Cryptology - EUROCRYPT '88
(Davos, 1988), 419-453, Lecture Notes in Comput. Sci., 330, Springer, Berlin, 1988.

[7] Moh, T. T., A fast public key system with signature and master key functions,
Communications in Algebra, 27(5), pp. 2207-2222 (1999) & Lecture Notes at EE
Department of Stanford University (May 1999) & http://www.usdsi.com/ttm.html

[8] Patarin, J., Cryptanalysis of the Matsumoto and Imai public key scheme of
Eurocrypt '88, Des. Codes Cryptogr. 20 (2000), no. 2, 175-209.

[9] Patarin, J., Hidden field equations (HFE) and isomorphism of polynomials (IP): Two
new families of asymmetric algorithms, EuroCrypt '96, Lecture Notes in Comput.
Sci., (1996) Ueli Maurer ed., 33-48.



Jintai Ding 

Department of Mathematical Sciences 
University of Cincinnati 
Cincinnati, OH, 45221-0025, USA 
e-mail: ding@math.uc.edu

Dieter Schmidt 

Department of Electrical and Computer Engineering and Computer Science, 
University of Cincinnati 
Cincinnati, OH, 45221-0030, USA 
e-mail: dieter.schmidt@uc.edu




Progress in Computer Science and Applied Logic, Vol. 23, 129-137 
© 2004 Birkhauser Verlag Basel/Switzerland 



The Capacity Region of Broadcast Networks 
with Two Receivers 

Elona Erez and Meir Feder 



Abstract. According to the max-flow min-cut theorem a source $s$ can transmit
information to a sink $t$ in a graph $(V, E)$ at a rate that does not exceed the
capacity of the minimal cut that separates the source and the sink. Recently,
it has been shown that if the intermediate nodes in the network are allowed
to code the information that they receive, then the source $s$ can multicast
common information to several sinks at a rate that does not exceed the
min-cut between the source and any of the individual sinks. In this paper we find
the achievable rate region when there are two receiver nodes $t_1$ and $t_2$, but
we allow both common information at rate $R_0$ and private information
to $t_1$ and $t_2$ at rates $R_1$, $R_2$, respectively.

Keywords. Network codes, broadcast channel, multicast, min-cut max-flow
theorem.



1. Introduction 

According to the max-flow min-cut theorem, a source $s$ can transmit information to
a sink $t$ in a directed graph $G = (V, E)$ at a rate that does not exceed the capacity
of the minimal cut that separates the source and the sink. Any edge $(i, j)$ in the
network is assumed to be free of transmission errors and to have a capacity of $C_{ij}$ bits
per channel use. It has been shown by Ahlswede et al. [1] that if the intermediate
nodes in the network are allowed to code the information that they receive, then
the source $s$ can multicast common information to several sinks at a rate that
does not exceed the minimum of the min-cuts between the source and any of the
individual sinks. In this work we find the broadcast capacity region when there is
a single transmitter $s$ and two receiver nodes $t_1$ and $t_2$, but we allow both common
information at rate $R_0$ and private information to $t_1$ and $t_2$ at rates $R_1$, $R_2$,
respectively. The special case of $R_2 = 0$ was solved using an algebraic formulation
in [2]. We also show how to construct codes that achieve any triplet $(R_0, R_1, R_2)$
in the capacity region. Specifically, we show that this situation is equivalent to
the common multicast case in an extended network that we define. The code for
the original network is derived from the common multicast code of the extended
network. We note that the fact that the entire capacity region for the broadcast
network can be easily found is rather surprising, since the parallel problem of the
two-user broadcast channel in multiuser information theory is in fact not solved
yet [3], [4], [5]. As it turns out, the resulting capacity region is what one would
intuitively expect by postulating achievability of the min-cut bounds. In the sequel we
use the term unicast network for the case where there is no common information,
multicast network for the case of no private information, and broadcast
network for the case of both private and common information.



2. Unicast Network 

As mentioned in the introduction, in the multicast case, when all the sinks have 
to receive the same information, the maximal rate can be achieved through coding 
at the intermediate nodes. 

At the other extreme, termed the unicast case, where each receiver
is required to receive different information, the capacity region, as noted in [2],
can be achieved without coding. Suppose that there are $L$ receivers $t_1, \dots, t_L$,
required to receive information at rates $R_1, \dots, R_L$. We want to verify whether the
rate requirements are feasible. The original network can be extended to another
network, as can be seen in Figure 1, where each sink $t_i$ is connected to the
supersink $T$. The capacity of the link that connects $t_i$ with $T$ is $R_i$. Clearly, there
is a one-to-one correspondence between communication from $s$ to $T$ at rate
$R_1 + R_2 + \cdots + R_L$ in the extended network and communication from $s$ to
$t_1, t_2, \dots, t_L$ at rates $R_1, R_2, \dots, R_L$ in the original network.

Figure 1. Unicast Network: (a) Original Network, (b) Extended Network

Since the communication in the extended network is between a single source and a
single sink, no coding is required. The min-cut bound can be achieved using the
Ford-Fulkerson algorithm for maximal flow. Inspecting the various cuts in the
extended network, it follows that the capacity region for the original network is:

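$$\sum_{t_i \in \Theta} R_i \le \mathrm{mincut}(s; \Theta), \quad \forall\, \Theta \subseteq \{t_1, t_2, \dots, t_L\} \qquad (1)$$

where $\mathrm{mincut}(s; \Theta)$ denotes the minimal cut between $s$ and the subset of sinks $\Theta$.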
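The feasibility test described above (attach a supersink and run a max-flow computation) can be sketched as follows. This is a minimal illustration that assumes integer capacities and rates; it uses the Edmonds-Karp variant of Ford-Fulkerson, and the small network, the node names and the supersink label `"T"` are hypothetical.

```python
from collections import deque, defaultdict

def max_flow(edges, source, sink):
    """Edmonds-Karp max flow; edges is a list of (tail, head, capacity)."""
    cap = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        cap[u][v] += c
        cap[v][u] += 0  # make sure the residual (reverse) arc exists
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:  # breadth-first search
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:  # push flow along the path
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

def unicast_feasible(edges, source, rates):
    """rates maps each sink t_i to R_i; assumes no node is already named 'T'."""
    extended = list(edges) + [(t, "T", r) for t, r in rates.items()]
    return max_flow(extended, source, "T") == sum(rates.values())

# Hypothetical original network with two sinks.
NETWORK = [("s", "a", 2), ("s", "b", 1), ("a", "t1", 1), ("a", "t2", 1), ("b", "t2", 1)]
print(unicast_feasible(NETWORK, "s", {"t1": 1, "t2": 2}))
print(unicast_feasible(NETWORK, "s", {"t1": 2, "t2": 1}))
```

Because the links into the supersink have capacities $R_i$, the flow into $T$ can equal $\sum_i R_i$ only if every sink simultaneously receives its full rate, which is the correspondence used in the text.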



3. Main Result 

In the case of two receivers we show how to solve the intermediate case, when
there is both common and private information to deliver. Suppose it is required
to design a broadcast code with rates $(R_0, R_1, R_2)$. We first need to verify that
these rates are within the capacity region, which can be found as follows. The
original network $G$ can be transformed into an extended network $G'$. In $G'$ there
are two supersinks $T_1$ and $T_2$, as shown in Figure 2. The sink $t_1$ is connected to
node $t'_1$ through a link of capacity $R_0 + R_1$. The node $t'_1$ is connected to $T_1$ and $T_2$
through links of capacity $R_0 + R_1$ and $R_1$, respectively. The sink $t_2$ is connected
to node $t'_2$ through a link of capacity $R_0 + R_2$. The node $t'_2$ is connected to $T_1$ and
$T_2$ through links of capacity $R_2$ and $R_0 + R_2$, respectively.

Figure 2. Broadcast Network: (a) Original Network G, (b) Extended Network G'

As was shown in [6], [2], [7], for the common multicast case linear codes achieve
the maximal rate. We use notation similar to [7]. We assume that each link has a
capacity of 1 bit per channel use and that there are $C_{ij}$ parallel links from node $i$
to node $j$. Denote by $h$ bits per channel use the rate of the code. Any link $e$ has
a vector $b(e)$ of dimension $h$ associated with it. We use the notation $\Gamma_I(v)$ and
$\Gamma_O(v)$ for the sets of edges reaching and leaving node $v$, respectively. The source
node $s$ gets $h$ binary input symbols$^1$, denoted $X_1, \dots, X_h$. The vector $b(e) \in \mathbb{F}_2^h$
associated with an edge $e$ leaving node $v$ is given by:

$$b(e) = \sum_{e' \in \Gamma_I(v)} m_e(e')\, b(e') \qquad (2)$$

where $b(e')$ is the vector associated with an edge $e'$ entering the node $v$ from which
$e$ leaves, and $m_e(e')$ is the coding coefficient.

Let $y(e)$ be the symbol transmitted on a link $e$, which is given by

$$y(e) = \sum_{e' \in \Gamma_I(v)} m_e(e')\, y(e') = b(e)^T x \qquad (3)$$

where $x = (X_1, \dots, X_h)^T$ denotes the input vector. Each node $v$ has an associated
subspace $U(v)$ given by:

$$U(v) = \mathrm{span}\{b(e) : e \in \Gamma_I(v)\} \qquad (4)$$

If the dimension of $U(t_i)$ is $h$, then sink $t_i$ can reconstruct the transmitted message
[6], [2].
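Relations (2) to (4) can be illustrated on the well-known butterfly network, which is not an example from this paper. The sketch below assumes $h = 2$, all coding coefficients $m_e(e') = 1$ unless stated otherwise, and that the two source edges carry $X_1$ and $X_2$ directly; vectors over $\mathbb{F}_2$ are stored as integer bitmasks, and all function and edge names are illustrative.

```python
def rank_gf2(vectors):
    """Rank over GF(2) of a list of bitmask vectors (XOR-basis reduction)."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears the leading bit of b in v when it is set
        if v:
            basis.append(v)
    return len(basis)

def global_vectors(edges, source, coeff=None):
    """Apply (2) edge by edge; edges are (name, tail, head) in topological order."""
    coeff = coeff or {}
    b, incoming, next_unit = {}, {}, 0
    for name, tail, head in edges:
        if tail == source:
            b[name] = 1 << next_unit  # assumption: source edge carries one input symbol
            next_unit += 1
        else:
            vec = 0
            for e_in in incoming.get(tail, []):
                if coeff.get((name, e_in), 1):  # m_e(e'), default 1
                    vec ^= b[e_in]
            b[name] = vec
        incoming.setdefault(head, []).append(name)
    return b, incoming

def symbol(b_e, x):
    """y(e) = b(e)^T x over GF(2), as in (3); x is a bitmask of input symbols."""
    return bin(b_e & x).count("1") % 2

BUTTERFLY = [
    ("e1", "s", "a"), ("e2", "s", "b"), ("e3", "a", "t1"), ("e4", "a", "c"),
    ("e5", "b", "t2"), ("e6", "b", "c"), ("e7", "c", "d"),
    ("e8", "d", "t1"), ("e9", "d", "t2"),
]
b, incoming = global_vectors(BUTTERFLY, "s")
dims = {t: rank_gf2([b[e] for e in incoming[t]]) for t in ("t1", "t2")}
print(b, dims)
```

Here the middle edge `e7` carries $X_1 + X_2$, and both sinks see a subspace of dimension $h = 2$, so by the criterion following (4) both can reconstruct the message.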

Lemma 3.1. The rate $(R_0, R_1, R_2)$ is achievable in the original network $G$ if and
only if $R_0 + R_1 + R_2$ is an achievable multicast rate in the extended graph $G'$.

Proof: If $(R_0, R_1, R_2)$ is achievable in $G$, then $t_1$ can transmit $R_0 + R_1$ bits to $T_1$
and $t_2$ can transmit its private $R_2$ bits to $T_1$. Thus $T_1$ can reconstruct $R_0 + R_1 + R_2$
bits. Likewise, $T_2$ can reconstruct $R_0 + R_1 + R_2$ bits. Note that no processing is
required in this case by the supersinks, except detecting the incoming information.
For the opposite direction, if rate $R_0 + R_1 + R_2$ is an achievable multicast rate
in the extended graph $G'$, then it can be achieved by a linear code. For this code,
since the entire message is reconstructible by $T_1$ and $T_2$, it follows that:

$$\dim\{U(T_1)\} = \dim\{U(T_2)\} = R_0 + R_1 + R_2 \qquad (5)$$

where $U(\cdot)$ is defined in (4) and is a subspace of an $(R_0 + R_1 + R_2)$-dimensional vector
space. It also follows that

$$\dim\{U(t'_1) + U(t'_2)\} = R_0 + R_1 + R_2 \qquad (6)$$

where on the LHS '+' denotes the sum of subspaces. Since the capacity of the
incoming link to $t'_1$ is $R_0 + R_1$ we have:

$$\dim\{U(t'_1)\} \le R_0 + R_1. \qquad (7)$$

From the following relation

$$\dim\{U(T_1)\} \le \dim\{U(t'_1)\} + R_2 \qquad (8)$$

and (5) it follows that:

$$\dim\{U(t'_1)\} \ge R_0 + R_1 \qquad (9)$$


$^1$ As shown in [7], in the case of two receivers a binary field suffices for the code.







Thus,

$$\dim\{U(t'_1)\} = R_0 + R_1. \qquad (10)$$

Likewise

$$\dim\{U(t'_2)\} = R_0 + R_2. \qquad (11)$$

Recall the following general relation:

$$\dim\{U(t'_1) \cap U(t'_2)\} = \dim\{U(t'_1)\} + \dim\{U(t'_2)\} - \dim\{U(t'_1) + U(t'_2)\} \qquad (12)$$

From (6), (10), (11) and (12) it follows that:

$$\dim\{U(t'_1) \cap U(t'_2)\} = R_0 \qquad (13)$$
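Relation (12) is easy to check numerically. The sketch below computes ranks over GF(2) for the spanning vectors of $U(t'_1)$ and $U(t'_2)$ that appear in the example later in the paper, where $(R_0, R_1, R_2) = (2, 1, 1)$, and recovers the intersection dimension $R_0 = 2$. The helper names and the bitmask encoding of vectors are choices made for this illustration.

```python
def to_mask(bits):
    """Encode a 0/1 tuple as an integer bitmask (first entry is the top bit)."""
    mask = 0
    for bit in bits:
        mask = (mask << 1) | bit
    return mask

def rank_gf2(vectors):
    """Rank over GF(2) of a list of bitmask vectors."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # reduce v by each stored basis vector
        if v:
            basis.append(v)
    return len(basis)

def intersection_dim(span_1, span_2):
    """dim(U1 ∩ U2) = dim U1 + dim U2 - dim(U1 + U2), as in (12)."""
    u1 = [to_mask(v) for v in span_1]
    u2 = [to_mask(v) for v in span_2]
    return rank_gf2(u1) + rank_gf2(u2) - rank_gf2(u1 + u2)

U_T1 = [(1, 0, 0, 0), (0, 0, 1, 0), (0, 1, 1, 1)]
U_T2 = [(0, 0, 0, 1), (0, 0, 1, 0), (0, 1, 1, 1)]
print(intersection_dim(U_T1, U_T2))
```

With dimensions 3, 3 and 4 for $U(t'_1)$, $U(t'_2)$ and their sum, the formula gives $3 + 3 - 4 = 2$, in agreement with (13).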

We now show by construction that there exists an $(R_0, R_1, R_2)$ code for $G$.
Denote by $U_c$ the intersection subspace:

$$U_c = U(t'_1) \cap U(t'_2) = \mathrm{span}\{c_1, \dots, c_{R_0}\} \qquad (14)$$

where $\{c_1, \dots, c_{R_0}\}$ is a basis of $U_c$. Similarly denote by $U_a$ and $U_b$ the following
subspaces:

$$U_a = U(t'_1) \setminus U_c = \mathrm{span}\{a_1, \dots, a_{R_1}\}$$
$$U_b = U(t'_2) \setminus U_c = \mathrm{span}\{b_1, \dots, b_{R_2}\} \qquad (15)$$

Clearly, $U_a$, $U_b$ and $U_c$ intersect only in the zero vector. We have

$$U(t'_1) = \mathrm{span}\{a_1, \dots, a_{R_1}, c_1, \dots, c_{R_0}\}$$
$$U(t'_2) = \mathrm{span}\{b_1, \dots, b_{R_2}, c_1, \dots, c_{R_0}\} \qquad (16)$$

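Define the matrix $M$ and the vector $d$ by the following matrix relation:

$$Mx = \begin{pmatrix} a_1^T \\ \vdots \\ a_{R_1}^T \\ b_1^T \\ \vdots \\ b_{R_2}^T \\ c_1^T \\ \vdots \\ c_{R_0}^T \end{pmatrix} \begin{pmatrix} X_1 \\ \vdots \\ X_{R_0+R_1+R_2} \end{pmatrix} = \begin{pmatrix} d_1 \\ \vdots \\ d_{R_0+R_1+R_2} \end{pmatrix} = d \qquad (17)$$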

From (6) it follows that $M$ has full rank. Thus, if $\{X_1, \dots, X_{R_0+R_1+R_2}\}$ are
statistically independent and uniformly distributed, so are $\{d_1, \dots, d_{R_0+R_1+R_2}\}$, and vice
versa. The first $R_1$ symbols of $d$ can be reconstructed at $t'_1$ only and are therefore
the private data to $t'_1$, and hence also to $t_1$. The next $R_2$ symbols of $d$ are the
private data to $t'_2$, and hence also to $t_2$. The last $R_0$ symbols of $d$ are the common
data. Therefore, $(R_0, R_1, R_2)$ is achievable in $G$. □







It follows that in order to find the achievable region for G, we have to find the 
achievable multicast rate for G', according to the min-cut. The following theorem 
can be proved immediately using the lemma. 

Theorem 3.1. The achievable rate region $(R_0, R_1, R_2)$ for the broadcast network $G$
is given by:

$$R_0 + R_1 \le \mathrm{mincut}(s; t_1) \qquad (18)$$
$$R_0 + R_2 \le \mathrm{mincut}(s; t_2) \qquad (19)$$
$$R_0 + R_1 + R_2 \le \mathrm{mincut}(s; t_1, t_2) \qquad (20)$$


Proof. The bounds (18) and (19) are trivial. The potential minimal cuts between
$s$ and $T_1$, as shown in Figure 3, can be divided into three characteristic types.
Cuts of type a yield the bound $R_0 + R_1 + R_2 \le R_0 + R_1 + \mathrm{cut}(s, t_1; t_2)$, or
$R_2 \le \mathrm{cut}(s, t_1; t_2)$. Cuts of type b yield the bound $R_0 + R_1 + R_2 \le R_2 + \mathrm{cut}(s, t_2; t_1)$, or
$R_0 + R_1 \le \mathrm{cut}(s, t_2; t_1)$. Cuts of type c yield the bound $R_0 + R_1 + R_2 \le \mathrm{cut}(s; t_2, t_1)$.
Similar bounds hold for $T_2$. It follows that the restrictive cuts can be only of type
b and c. The bounds of the theorem follow. □




Figure 3. Characteristic Cuts 

Note that for the multicast case, i.e., $R_1 = R_2 = 0$, the capacity region becomes
$R_0 \le \min\{\mathrm{mincut}(s; t_1), \mathrm{mincut}(s; t_2)\}$, as expected. For the unicast case,
i.e., $R_0 = 0$, the supersinks $T_1$ and $T_2$ are equivalent and therefore a single
supersink suffices and no coding is required, as expected.
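Once the three min-cut values are known, Theorem 3.1 reduces membership in the capacity region to three inequalities. The sketch below is a direct transcription of (18) to (20); the function name is illustrative, and the cut values 3, 3 and 4 are those of the worked example later in the paper.

```python
def in_capacity_region(r0, r1, r2, cut_t1, cut_t2, cut_both):
    """Check (18)-(20) for nonnegative rates, given the three min-cut values."""
    return (
        min(r0, r1, r2) >= 0
        and r0 + r1 <= cut_t1
        and r0 + r2 <= cut_t2
        and r0 + r1 + r2 <= cut_both
    )

# Integer rate triples that meet the sum-rate bound (20) with equality.
CUTS = (3, 3, 4)
maximal = [
    (r0, r1, r2)
    for r0 in range(5) for r1 in range(5) for r2 in range(5)
    if r0 + r1 + r2 == CUTS[2] and in_capacity_region(r0, r1, r2, *CUTS)
]
print(maximal)
```

Setting $R_1 = R_2 = 0$ in the same function reproduces the multicast bound noted above, since only the two single-sink constraints can then be active.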

There is a one-to-one correspondence between a multicast code in $G'$ and a
broadcast code in $G$. In order to design a broadcast code for $G$, we begin by designing
a multicast code in $G'$. The multicast code can be designed using the polynomial
time algorithm developed in [7]. Given a multicast code for $G'$, a broadcast code
for $G$ is derived using simple processing, as explained in the proof of Lemma 3.1.
The procedure is illustrated in the following example.

Example. Figure 4 shows the graph, already with its extension (dashed lines).
Unless otherwise specified, the capacity of each edge is 1 bit.

Figure 4. Code Construction: (a) Constructing Multicast Code in G', (b) Resulting Broadcast Code in G

The capacity region is given by:

$$R_0 + R_1 \le 3$$
$$R_0 + R_2 \le 3$$
$$R_0 + R_1 + R_2 \le 4 \qquad (21)$$

Therefore, the rate $(R_0, R_1, R_2) = (2, 1, 1)$ is in the capacity region. We design
the multicast code for the extended graph, as shown in Figure 4. The information
received by $t'_1$ for this code is $b_1$, $b_2 + b_4$, $b_2 + b_3 + b_4$, and the information received
by $t'_2$ is $b_3$, $b_4$, $b_2 + b_3 + b_4$. Thus we have:

$$U(t'_1) = \mathrm{span}\{(1,0,0,0)^T, (0,0,1,0)^T, (0,1,1,1)^T\}$$
$$U(t'_2) = \mathrm{span}\{(0,0,0,1)^T, (0,0,1,0)^T, (0,1,1,1)^T\} \qquad (22)$$







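Thus,

$$U_a = \mathrm{span}\{(1,0,0,0)^T\}$$
$$U_b = \mathrm{span}\{(0,0,0,1)^T\}$$
$$U_c = \mathrm{span}\{(0,0,1,0)^T, (0,1,1,1)^T\} \qquad (23)$$

and

$$Mx = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix} = \begin{pmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \end{pmatrix} \qquad (24)$$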

It follows that $d_1$, $d_3$, $d_4$ can be reconstructed at $t_1$ and $d_2$, $d_3$, $d_4$ at $t_2$. Thus $d_3$ and
$d_4$ are the common information, whereas $d_1$ and $d_2$ are the private information.
Since $M$ is guaranteed to be full rank, inverting the matrix $M$ of (24) yields:

$$b_1 = d_1, \quad b_2 = d_2 + d_3 + d_4, \quad b_3 = d_3, \quad b_4 = d_2 \qquad (25)$$

The final code constructed is also given in Figure 4. Note that unlike the code for
the extended graph, the code for $G$ requires (pre)coding at the source.
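The precoding step can be reproduced by inverting $M$ over GF(2). The sketch below uses a plain Gauss-Jordan elimination written for this illustration (the function name is not from the paper) and recovers the relations in (25) from the matrix in (24).

```python
def invert_gf2(matrix):
    """Invert a square 0/1 matrix over GF(2) by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment with the identity; the right half ends up holding the inverse.
    aug = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for c in range(n):
        p = next((r for r in range(c, n) if aug[r][c]), None)
        if p is None:
            raise ValueError("matrix is singular over GF(2)")
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c and aug[r][c]:
                aug[r] = [a ^ b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

# Rows are a_1, b_1, c_1, c_2 from (23), as in (24).
M = [
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
]
M_inv = invert_gf2(M)
for i, row in enumerate(M_inv, start=1):
    terms = " + ".join(f"d{j}" for j, bit in enumerate(row, start=1) if bit)
    print(f"b{i} = {terms}")
```

Row $i$ of the inverse lists which symbols $d_j$ the source must add to form $b_i$, which is exactly the (pre)coding mentioned above.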

4. Conclusion and Further Research 

In this paper we have shown how an extension of a graph enables us to find the
capacity region and to construct network codes for scenarios more general than
multicast. Unfortunately, it is not known whether this technique can be extended
to more than two receivers, or what the capacity region is in that
case. For example, in [6] the scenario of Figure 5 is given. The random process $X_0$
is the common information with rate $R_0$ and the random process $X_2$ is the private
information of $t_2$ with rate $R_2$. By inspection, it can be seen that the bound on
the rates is $2R_0 + R_2 \le 2$. However, we have not found an extension of
the graph that enables us to solve it. The problem seems to be that, whereas in the
two-receiver case the only requirement is that a certain amount of information
be private and a certain amount common, here we have stricter requirements
on exactly which information is common and which is private.

Figure 5. Broadcast Network With Three Receivers [Yeung 2002]

Interference is a different scenario in a network, in which a source $s_1$ has to
transmit information to a sink $t_1$ at rate $R_1$ and a different source $s_2$ has to
transmit information to $t_2$ at rate $R_2$. A possible code, which is not necessarily
optimal, can be obtained by joining $s_1$ and $s_2$ to a super source $S$ with links of
capacities $R_1$ and $R_2$, respectively. The super source $S$ is then required to multicast
the same information to $t_1$ and $t_2$. Better rates can be shown to be achievable using
our method, where only a certain part of the information is common and the rest
is private.

References 

[1] R. Ahlswede, N. Cai, S.-Y.R. Li, and R.W. Yeung, "Network information flow,"
IEEE Transactions on Information Theory, vol. 46, no. 4, pp. 1204-1216, July 2000.

[2] R. Koetter and M. Medard, "An algebraic approach to network coding," Proceedings
of INFOCOM, 2002.

[3] T.M. Cover, "Comments on broadcast channels," IEEE Transactions on Information
Theory, vol. 44, pp. 2524-2530, 1998.

[4] K. Marton, "A coding theorem for the discrete memoryless broadcast channel," IEEE
Transactions on Information Theory, vol. 25, pp. 306-311, 1979.

[5] T.M. Cover, "An achievable rate region for the broadcast channel," IEEE
Transactions on Information Theory, vol. 21, pp. 399-404, 1975.

[6] R. Yeung, A First Course in Information Theory, Kluwer Academic/Plenum
Publishers, March 2002.

[7] P. Sanders, S. Egner, and L. Tolhuizen, "Polynomial time algorithms for network
information flow," 2002.



Elona Erez and Meir Feder
Dept. of Electrical Engineering-Systems
Tel Aviv University
Tel Aviv, 69978, Israel
e-mail: elona@eng.tau.ac.il
e-mail: meir@eng.tau.ac.il




Progress in Computer Science and Applied Logic, Vol. 23, 139-152 
© 2004 Birkhauser Verlag Basel/Switzerland 



Constructions of Nonbinary Codes
Correcting t-Symmetric Errors and
Detecting All Unidirectional Errors:
Magnitude Error Criterion

Fang-Wei Fu, San Ling and Chaoping Xing



Abstract. In this paper, based on residue rings of polynomials, we present 
a general construction for nonbinary codes capable of correcting t or fewer 
symmetric errors and detecting all unidirectional errors with the magnitude 
error criterion. Some new lower bounds for such codes are obtained from this 
general construction. 

Mathematics Subject Classification (2000). Primary 94B15; Secondary 94B60. 
Keywords. Coding theory, nonbinary codes, code construction, magnitude er- 
ror criterion, unidirectional errors, residue rings of polynomials. 



1. Introduction 

Let $V = \{0, 1, \ldots, m-1\}$ be a finite set, where $m \ge 2$ is a positive integer. Let
$V^n$ be the set of $n$-tuples over $V$, i.e.,

$$V^n = \{(x_1, x_2, \ldots, x_n) \mid x_i \in V,\ i = 1, 2, \ldots, n\}.$$

For $\mathbf{x} = (x_1, x_2, \ldots, x_n) \in V^n$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n) \in V^n$, the Hamming
distance $d_H(\mathbf{x},\mathbf{y})$ between $\mathbf{x}$ and $\mathbf{y}$ is the number of coordinates in which they
differ, i.e.,

$$d_H(\mathbf{x},\mathbf{y}) = |\{i \mid x_i \ne y_i\}|.$$

The $L_1$-distance $d_1(\mathbf{x},\mathbf{y})$ between $\mathbf{x}$ and $\mathbf{y}$ is defined as

$$d_1(\mathbf{x},\mathbf{y}) = \sum_{i=1}^{n} |x_i - y_i|.$$



This research work is supported in part by the DSTA project (POD 0103223), the National 
Natural Science Foundation of China under the Grant 60172060, the Trans-Century Training 
Program Foundation for the Talents by the Education Ministry of China, and the Foundation 
for University Key Teacher by the Education Ministry of China, the MOE-ARF grant R-146- 
000-029-112 and the 100-Talents program of the Chinese Academy of Science. 







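Obviously, for $m = 2$, $d_1(\mathbf{x},\mathbf{y}) = d_H(\mathbf{x},\mathbf{y})$. Denote

$$N(\mathbf{x},\mathbf{y}) = \sum_{i=1}^{n} \max\{y_i - x_i, 0\}.$$

The asymmetric distance $d_a(\mathbf{x},\mathbf{y})$ between $\mathbf{x}$ and $\mathbf{y}$ is defined as

$$d_a(\mathbf{x},\mathbf{y}) = \max\{N(\mathbf{x},\mathbf{y}), N(\mathbf{y},\mathbf{x})\}.$$

Clearly,

$$d_1(\mathbf{x},\mathbf{y}) = N(\mathbf{x},\mathbf{y}) + N(\mathbf{y},\mathbf{x}).$$

For $\mathbf{x} = (x_1, \ldots, x_n) \in V^n$ and $\mathbf{y} = (y_1, \ldots, y_n) \in V^n$, we say $\mathbf{x} \le \mathbf{y}$ if $x_i \le y_i$ for
all $i$. Note that if $\mathbf{x} \le \mathbf{y}$, then $N(\mathbf{y},\mathbf{x}) = 0$ and $d_1(\mathbf{x},\mathbf{y}) = d_a(\mathbf{x},\mathbf{y}) = N(\mathbf{x},\mathbf{y})$.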
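The definitions above translate directly into code. The following Python sketch computes $d_H$, $d_1$, $N$ and $d_a$ for two words; the function names and the sample words are illustrative choices and do not come from the paper.

```python
def hamming_distance(x, y):
    """d_H(x, y): number of coordinates in which x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)


def l1_distance(x, y):
    """d_1(x, y): sum of |x_i - y_i| over all coordinates."""
    return sum(abs(a - b) for a, b in zip(x, y))


def N(x, y):
    """N(x, y): sum of max(y_i - x_i, 0) over all coordinates."""
    return sum(max(b - a, 0) for a, b in zip(x, y))


def asymmetric_distance(x, y):
    """d_a(x, y) = max(N(x, y), N(y, x))."""
    return max(N(x, y), N(y, x))


# Two sample words over V = {0, 1, 2, 3} (m = 4, n = 4).
x = (0, 2, 1, 3)
y = (1, 2, 0, 1)
print(hamming_distance(x, y), l1_distance(x, y), N(x, y), N(y, x))
# The identity d_1 = N(x, y) + N(y, x) holds for any pair of words.
print(l1_distance(x, y) == N(x, y) + N(y, x))
```

For these two words, $N(\mathbf{x},\mathbf{y}) = 1$ and $N(\mathbf{y},\mathbf{x}) = 3$, so $d_1 = 4$ and $d_a = 3$, while $d_H = 3$.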

A nonempty subset $C$ of $V^n$ is called an $m$-ary code of length $n$. Furthermore,
if $|C| = M$, then $C$ is called an $(n, M)_m$ code. Any word $\mathbf{c}$ in $C$ is called a
codeword of $C$. The code $C$ is used to transmit information in digital communication
systems. In classical coding theory, when a codeword $\mathbf{c} \in C$ is sent and a
vector $\mathbf{y} \in V^n$ is received, the number of errors that occurred is defined as the number
of coordinates in which they differ, i.e., $d_H(\mathbf{c},\mathbf{y})$. We call this error criterion the
Hamming error criterion. Note that the magnitude of the difference at each of
these coordinates is not important in this definition. If one wishes to take into
account the magnitude of each symbol error, a suitable and widely used definition
for the number of errors that occurred is $\sum_{i=1}^{n} |y_i - c_i|$, i.e., $d_1(\mathbf{c},\mathbf{y})$. We call this error
criterion the magnitude error criterion. In this paper, we study constructions
of codes with the magnitude error criterion. This topic has been dealt with in [10],
[12], [15], [29] and [32].

Three types of errors, asymmetric errors, unidirectional errors and symmetric
errors, are defined as follows (see [2]). Suppose a codeword $\mathbf{c} \in C$ is sent and a
vector $\mathbf{y} \in V^n$ is received. The number of errors that occurred is $d_1(\mathbf{c},\mathbf{y})$.

(i) We say that $\mathbf{c}$ has suffered asymmetric errors if $\mathbf{y} \le \mathbf{c}$.

(ii) We say that $\mathbf{c}$ has suffered unidirectional errors if either $\mathbf{y} \le \mathbf{c}$ or $\mathbf{c} \le \mathbf{y}$.

(iii) In general we say that $\mathbf{c}$ has suffered symmetric errors.

Note that for symmetric errors, we do not impose any specific relation between
$\mathbf{c}$ and $\mathbf{y}$ (such as $\mathbf{y} \le \mathbf{c}$ or $\mathbf{c} \le \mathbf{y}$). The following theorem (see [10], [15], [32]) gives
necessary and sufficient conditions on block codes correcting/detecting certain
types of errors.

Theorem 1.1. With the magnitude error criterion,

(i) a code $C$ is capable of correcting $t$ or fewer symmetric errors if and only if
$d_1(\mathbf{x},\mathbf{y}) \ge 2t + 1$ for all $\mathbf{x}, \mathbf{y} \in C$ and $\mathbf{x} \ne \mathbf{y}$;

(ii) a code $C$ is capable of correcting $t$ or fewer asymmetric errors if and only if
$d_a(\mathbf{x},\mathbf{y}) \ge t + 1$ for all $\mathbf{x}, \mathbf{y} \in C$ and $\mathbf{x} \ne \mathbf{y}$;

(iii) a code $C$ is capable of correcting $t$ or fewer symmetric errors and detecting
all unidirectional errors if and only if $N(\mathbf{x},\mathbf{y}) \ge t + 1$ and $N(\mathbf{y},\mathbf{x}) \ge t + 1$
for all $\mathbf{x}, \mathbf{y} \in C$ and $\mathbf{x} \ne \mathbf{y}$.
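The three conditions of Theorem 1.1 can be checked mechanically on a small code. The sketch below is a brute-force test over all pairs of codewords; the function names and the toy ternary code are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def N(x, y):
    """N(x, y): sum of max(y_i - x_i, 0) over all coordinates."""
    return sum(max(b - a, 0) for a, b in zip(x, y))


def corrects_symmetric(code, t):
    """Theorem 1.1(i): d_1(x, y) >= 2t + 1 for all distinct codewords."""
    return all(N(x, y) + N(y, x) >= 2 * t + 1 for x, y in combinations(code, 2))


def corrects_asymmetric(code, t):
    """Theorem 1.1(ii): d_a(x, y) >= t + 1 for all distinct codewords."""
    return all(max(N(x, y), N(y, x)) >= t + 1 for x, y in combinations(code, 2))


def is_t_sec_aued(code, t):
    """Theorem 1.1(iii): N(x, y) >= t + 1 and N(y, x) >= t + 1 for all pairs."""
    return all(N(x, y) >= t + 1 and N(y, x) >= t + 1
               for x, y in combinations(code, 2))


# Toy ternary codes of length 2 (m = 3); all words have coordinate sum 2.
small = [(0, 2), (2, 0)]
large = [(0, 2), (1, 1), (2, 0)]
print(is_t_sec_aued(small, 1))   # both N values equal 2 for the single pair
print(is_t_sec_aued(large, 1))   # (0, 2) and (1, 1) only reach N = 1
```

Condition (iii) implies condition (i), because $d_1 = N(\mathbf{x},\mathbf{y}) + N(\mathbf{y},\mathbf{x}) \ge 2t + 2$, which is why the check for (iii) is the only one needed in the rest of the paper.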







Note that for $m = 2$, the magnitude error criterion is just the Hamming
error criterion. Hence, Theorem 1.1 generalizes the corresponding theorems for
binary codes to $m$-ary codes with the magnitude error criterion. Actually, one can
prove Theorem 1.1 by extending and modifying the arguments used for proving
the corresponding theorems for binary codes (see [2] and [32]). Clearly, one can
see from the definitions of the three types of errors that if a code $C$ is capable of
correcting $t$ or fewer symmetric errors, then it is capable of correcting $t$ or fewer
unidirectional errors; if a code $C$ is capable of correcting $t$ or fewer unidirectional
errors, then it is capable of correcting $t$ or fewer asymmetric errors. Note that
nonbinary codes for correcting asymmetric errors have been studied in [10], [12],
[15], [16], [27], [29] and [32].

Remark 1.2. Weber et al. [32] gave necessary and sufficient conditions for a block
code to be capable of correcting up to $t_1$ symmetric errors, up to $t_2$ unidirectional
errors, and up to $t_3$ asymmetric errors, as well as detecting from $t_1 + 1$ to $d_1$ symmetric
errors that are not of the unidirectional type, from $t_2 + 1$ to $d_2$ unidirectional
errors that are not of the asymmetric type, and from $t_3 + 1$ to $d_3$ asymmetric errors.
As special cases, Theorem 1.1 follows directly from these general necessary and
sufficient conditions. In this paper, we only need to use Theorem 1.1 to establish
our results.

Let $\Gamma_m(n,t)$ denote the maximum size of an $(n,M)_m$ code which is capable
of correcting $t$ or fewer symmetric errors and detecting all unidirectional errors
with the magnitude error criterion. By Theorem 1.1, we know that $\Gamma_m(n,t)$ is the
maximum size of an $(n,M)_m$ code $C$ satisfying $N(\mathbf{x},\mathbf{y}) \ge t + 1$ and $N(\mathbf{y},\mathbf{x}) \ge t + 1$
for all $\mathbf{x}, \mathbf{y} \in C$ and $\mathbf{x} \ne \mathbf{y}$. Note that binary codes capable of correcting $t$ or fewer
symmetric errors and detecting all unidirectional errors have been studied in [1]-[7]
and [10]-[36].

In this paper, based on residue rings of polynomials, we present a general
construction for nonbinary codes capable of correcting $t$ or fewer symmetric errors
and detecting all unidirectional errors with the magnitude error criterion. Some
new lower bounds for such codes are obtained from this general construction.
This paper is organized as follows. In Section 2, we review and derive some basic
properties of generalized binomial coefficients introduced and studied in [8] and [9].
In Section 3, we present a general construction for nonbinary codes for correcting
$t$ or fewer symmetric errors and detecting all unidirectional errors. In Section 4,
some new lower bounds for $\Gamma_m(n,t)$ are given.



2. Generalized Binomial Coefficients 

In this section, we review and derive some basic properties of generalized binomial
coefficients introduced and studied in [8] and [9].







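Given three integers $m, n \ge 1$ and $r \ge 0$, the generalized binomial coefficient
$\binom{n}{r}_m$ is defined as follows:

$$\binom{1}{r}_m = \begin{cases} 1, & \text{if } 0 \le r \le m-1, \\ 0, & \text{otherwise} \end{cases}$$

and

$$\binom{n}{r}_m = \sum_{i=0}^{m-1} \binom{n-1}{r-i}_m$$

for $n \ge 2$.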


The following basic properties of generalized binomial coefficients are listed in [8,
pp. 215-216].

Properties:

(i) $\binom{n}{r}_m$ is the number of integer solutions to the equation

$$x_1 + x_2 + \cdots + x_n = r$$

with $0 \le x_i \le m - 1$ for each $i = 1, 2, \ldots, n$;

(ii) $\binom{n}{0}_m = 1$;

(iii) $\binom{n}{1}_m = n$, where $m \ge 2$;

(iv) $\binom{n}{r}_m = \sum_{i=0}^{n} (-1)^i \binom{n}{i} \binom{n + r - mi - 1}{n - 1}$;

(v) $\binom{n}{r}_m = \binom{n}{s}_m$, where $r + s = n(m-1)$.

By Property (i), we have

$$\binom{n}{r}_m = 0 \quad \text{for } r < 0 \text{ or } r > n(m-1).$$

Hence, we only need to consider $\binom{n}{r}_m$ for the case $0 \le r \le n(m-1)$. Obviously,
by Property (i), $\binom{n}{r}_m \ge 1$ for $0 \le r \le n(m-1)$. Below we derive the following
unimodal property of $\binom{n}{r}_m$.
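The recurrence and the listed properties are easy to cross-check numerically. The sketch below computes the generalized binomial coefficient by the recurrence over $n$, by brute-force counting as in Property (i), and by the inclusion-exclusion sum of Property (iv), here truncated at $i \le \lfloor r/m \rfloor$ so that every ordinary binomial coefficient has non-negative entries. The function names are illustrative assumptions.

```python
from functools import lru_cache
from itertools import product
from math import comb


@lru_cache(maxsize=None)
def gen_binom(n, r, m):
    """Generalized binomial coefficient, computed by the recurrence on n."""
    if r < 0:
        return 0
    if n == 1:
        return 1 if r <= m - 1 else 0
    return sum(gen_binom(n - 1, r - i, m) for i in range(m))


def count_solutions(n, r, m):
    """Property (i): solutions of x_1 + ... + x_n = r with 0 <= x_i <= m - 1."""
    return sum(1 for xs in product(range(m), repeat=n) if sum(xs) == r)


def inclusion_exclusion(n, r, m):
    """Property (iv), summed only over the terms with r - m*i >= 0."""
    return sum((-1) ** i * comb(n, i) * comb(n + r - m * i - 1, n - 1)
               for i in range(min(n, r // m) + 1))


# Coefficients of (1 + x + x^2)^3, i.e. n = 3 and m = 3.
print([gen_binom(3, r, 3) for r in range(7)])
# For m = 2 the ordinary binomial coefficients are recovered.
print(gen_binom(4, 2, 2) == comb(4, 2))
```

Equivalently, $\binom{n}{r}_m$ is the coefficient of $x^r$ in $(1 + x + \cdots + x^{m-1})^n$, which is a convenient way to sanity-check small cases by hand.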



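Lemma 2.1. For $n \ge 2$,

(i) if $n(m-1)$ is even, then

$$\binom{n}{0}_m < \binom{n}{1}_m < \cdots < \binom{n}{n(m-1)/2}_m > \cdots > \binom{n}{n(m-1)-1}_m > \binom{n}{n(m-1)}_m;$$

(ii) if $n(m-1)$ is odd, then

$$\binom{n}{0}_m < \binom{n}{1}_m < \cdots < \binom{n}{[n(m-1)-1]/2}_m = \binom{n}{[n(m-1)+1]/2}_m > \cdots > \binom{n}{n(m-1)-1}_m > \binom{n}{n(m-1)}_m.$$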





Proof. By Property (v), we only need to prove that for $n \ge 2$

$$\binom{n}{r-1}_m < \binom{n}{r}_m \quad \text{for } 1 \le r \le \frac{n(m-1)}{2}. \tag{2.1}$$

Below we prove (2.1) by mathematical induction. From the definition of $\binom{n}{r}_m$, we
have

$$\binom{n}{r}_m - \binom{n}{r-1}_m = \sum_{i=0}^{m-1} \binom{n-1}{r-i}_m - \sum_{i=0}^{m-1} \binom{n-1}{r-1-i}_m = \binom{n-1}{r}_m - \binom{n-1}{r-m}_m. \tag{2.2}$$

For $n = 2$, it follows from (2.2) and the definition of $\binom{1}{r}_m$ that for $1 \le r \le m-1$,

$$\binom{2}{r}_m - \binom{2}{r-1}_m = \binom{1}{r}_m - \binom{1}{r-m}_m = 1 - 0 = 1 > 0.$$

Hence, (2.1) is true for $n = 2$. Assume that (2.1) is true for $n = k-1$. Now we
prove that (2.1) is true for $n = k$. By (2.2),

$$\binom{k}{r}_m - \binom{k}{r-1}_m = \binom{k-1}{r}_m - \binom{k-1}{r-m}_m. \tag{2.3}$$

Next we consider three cases:

(A) If $1 \le r \le m-1$, then $\binom{k-1}{r-m}_m = 0$. Hence, by (2.3), $\binom{k}{r}_m > \binom{k}{r-1}_m$.

(B) If $m \le r \le \frac{(k-1)(m-1)}{2}$, then $r - m < r \le \frac{(k-1)(m-1)}{2}$. Hence, by the induction
assumption, we have

$$\binom{k-1}{r}_m > \binom{k-1}{r-1}_m > \cdots > \binom{k-1}{r-m}_m.$$

By (2.3), this implies that

$$\binom{k}{r}_m > \binom{k}{r-1}_m.$$

(C) If $\frac{(k-1)(m-1)}{2} < r \le \frac{k(m-1)}{2}$, then

$$(k-1)(m-1) - r < \frac{(k-1)(m-1)}{2}$$

and

$$2r \le k(m-1) = (k-1)(m-1) + m - 1 < (k-1)(m-1) + m.$$

Hence,

$$r - m < (k-1)(m-1) - r < \frac{(k-1)(m-1)}{2}.$$

By (2.3), Property (v) and the induction assumption,

$$\binom{k}{r}_m - \binom{k}{r-1}_m = \binom{k-1}{(k-1)(m-1)-r}_m - \binom{k-1}{r-m}_m > 0.$$

From the discussion in (A), (B) and (C), we see that (2.1) is true for $n = k$. Hence,
by induction, (2.1) is true. □

The following result follows from Lemma 2.1 immediately.

Proposition 2.2.

$$\binom{n}{\lfloor n(m-1)/2 \rfloor}_m = \max_{0 \le r \le n(m-1)} \binom{n}{r}_m,$$

where $\lfloor x \rfloor$ is the greatest integer less than or equal to $x$.
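Lemma 2.1 and Proposition 2.2 can be confirmed numerically for small parameters. The following sketch builds each row of generalized binomial coefficients and checks that it rises strictly up to the middle, falls strictly afterwards, has two equal middle entries when $n(m-1)$ is odd, and attains its maximum at $\lfloor n(m-1)/2 \rfloor$. The helper names are illustrative assumptions.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def gen_binom(n, r, m):
    """Generalized binomial coefficient, computed by the recurrence on n."""
    if r < 0:
        return 0
    if n == 1:
        return 1 if r <= m - 1 else 0
    return sum(gen_binom(n - 1, r - i, m) for i in range(m))


def row(n, m):
    """All coefficients for r = 0, ..., n(m - 1)."""
    return [gen_binom(n, r, m) for r in range(n * (m - 1) + 1)]


def check_lemma(n, m):
    """Strict unimodality (Lemma 2.1) and the maximum of Proposition 2.2."""
    top = n * (m - 1)
    c = row(n, m)
    lo, hi = top // 2, (top + 1) // 2      # equal when n(m - 1) is even
    rising = all(c[r - 1] < c[r] for r in range(1, lo + 1))
    falling = all(c[r] > c[r + 1] for r in range(hi, top))
    return rising and falling and c[lo] == c[hi] and c[lo] == max(c)


print(row(2, 4))                 # n(m - 1) = 6 is even: a single peak
print(row(3, 2))                 # n(m - 1) = 3 is odd: two equal middle terms
print(all(check_lemma(n, m) for n in range(2, 7) for m in range(2, 6)))
```

The restriction $n \ge 2$ matters: for $n = 1$ and $m \ge 3$ the row is constant, so the strict inequalities fail.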



3. A General Construction 

Xing [33] gave a construction of binary constant weight codes. By modifying his 
method, Fu, Ling and Xing [11] presented a general construction for binary asym- 
metric error-correcting codes. Bose and Rao [6] gave a construction of binary codes 
capable of correcting t or fewer symmetric errors and detecting all unidirectional 
errors by using binary constant weight codes. By modifying and generalizing the 
methods in [6], [11] and [33], we present a general construction for nonbinary codes 
capable of correcting t or fewer symmetric errors and detecting all unidirectional 
errors with the magnitude error criterion. 

Let $\mathbb{F}_q$ be a finite field of $q$ elements, where $q$ is a prime power. Let $\mathbb{F}_q^*$ be
the set of nonzero elements of $\mathbb{F}_q$. For a monic polynomial $f(x) \in \mathbb{F}_q[x]$, consider
the residue class ring

$$R = \mathbb{F}_q[x]/(f(x)).$$

For simplicity, in this paper, we can also make the following identification:

$$R = \{g(x) \in \mathbb{F}_q[x] : \deg(g(x)) < \deg(f(x))\}.$$

The addition and multiplication operations in $R$ are the polynomial addition and
multiplication modulo $f(x)$.

Let $f(x)$ have the factorization

$$f(x) = \prod_{i=1}^{k} p_i(x)^{e_i},$$

where $p_1(x), \ldots, p_k(x)$ are distinct monic irreducible polynomials in $\mathbb{F}_q[x]$ and
$e_1, \ldots, e_k$ are positive integers. It is known that all invertible polynomials of the
ring $R$ form a multiplicative group, denoted by $G$. It is a finite abelian group and
consists of all polynomials in $R$ which are coprime to $f(x)$, that is

$$G = \{g(x) \in \mathbb{F}_q[x] : \deg(g(x)) < \deg(f(x)) \text{ and } (g(x), f(x)) = 1\}. \tag{3.1}$$

The multiplication operation $\odot$ over $G$ is the polynomial multiplication modulo
$f(x)$. This group contains exactly

$$\Phi(f(x)) = \prod_{i=1}^{k} (q^{d_i} - 1)\, q^{d_i(e_i - 1)} \tag{3.2}$$

elements, where $d_i$ is the degree of $p_i(x)$. Below we use the group $G$ to construct
nonbinary codes capable of correcting $t$ or fewer symmetric errors and detecting
all unidirectional errors with the magnitude error criterion.
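The count (3.2) of invertible residue classes can be compared with a brute-force count for small cases. The sketch below is restricted to prime fields $\mathbb{F}_p$ (an assumption made only to keep the arithmetic simple), represents polynomials as coefficient lists with the lowest degree first, and counts the polynomials of degree less than $\deg f$ that are coprime to $f$. All function names are illustrative.

```python
from itertools import product


def phi(q, factors):
    """Formula (3.2); factors lists (d_i, e_i) = (degree, exponent) of each p_i."""
    result = 1
    for d, e in factors:
        result *= (q ** d - 1) * q ** (d * (e - 1))
    return result


def trim(a):
    """Drop zero coefficients of highest degree (lists are lowest degree first)."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_mod(a, b, p):
    """Remainder of a divided by the nonzero polynomial b over F_p, p prime."""
    a, b = trim(a), trim(b)
    inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        c = a[-1] * inv % p
        shift = len(a) - len(b)
        for i, bi in enumerate(b):
            a[shift + i] = (a[shift + i] - c * bi) % p
        a = trim(a)
    return a


def coprime(a, b, p):
    """True when gcd(a, b) is a nonzero constant (Euclidean algorithm)."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_mod(a, b, p)
    return len(a) == 1


def count_units(f, p):
    """Brute-force size of G: polynomials g with deg g < deg f coprime to f."""
    deg = len(trim(f)) - 1
    return sum(1 for g in product(range(p), repeat=deg) if coprime(g, f, p))


# f(x) = x^2 (x^2 + 1) over F_3; x^2 + 1 is irreducible since -1 is a non-square.
f = [0, 0, 1, 0, 1]
print(phi(3, [(1, 2), (2, 1)]), count_units(f, 3))
```

In this example the formula gives $(3 - 1) \cdot 3 \cdot (3^2 - 1) = 48$, and the brute-force count over the 81 residue classes agrees.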

For $0 \le r \le n(m-1)$, denote

$$V^n(r) = \Big\{\mathbf{y} = (y_1, y_2, \ldots, y_n) \in V^n : \sum_{i=1}^{n} y_i = r\Big\}.$$

By Proposition 2.2 and Property (i) of $\binom{n}{r}_m$, we have

$$|V^n(r)| = \binom{n}{r}_m$$

and

$$|V^n(\lfloor n(m-1)/2 \rfloor)| = \binom{n}{\lfloor n(m-1)/2 \rfloor}_m = \max_{0 \le r \le n(m-1)} \binom{n}{r}_m = \max_{0 \le r \le n(m-1)} |V^n(r)|.$$

Construction. Let $m$, $n$ and $t$ be three positive integers satisfying $m \ge 2$, $n \le q$
and $1 \le t \le n(m-1)$. Let $f(x) \in \mathbb{F}_q[x]$ be a monic polynomial of degree $t$ such
that there exist $n$ distinct elements $\alpha_1, \alpha_2, \ldots, \alpha_n \in \mathbb{F}_q$ with $f(\alpha_i) \ne 0$ for all
$i = 1, 2, \ldots, n$.

Since $f(\alpha_i) \ne 0$ for $i = 1, 2, \ldots, n$, then $(x - \alpha_i)$ is coprime to $f(x)$ for
$i = 1, 2, \ldots, n$. Hence

$$(x - \alpha_i) \in G, \quad i = 1, 2, \ldots, n.$$

Consider the map

$$\Omega : V^n(\lfloor n(m-1)/2 \rfloor) \to G, \quad (c_1, c_2, \ldots, c_n) \mapsto \bigodot_{i=1}^{n} (x - \alpha_i)^{c_i} \in G.$$

For every $g(x) \in G$, denote

$$C_g = \Omega^{-1}(g(x)).$$

For every $g \in G$, if $C_g \ne \emptyset$, then $C_g$ is an $m$-ary code of length $n$ capable of
correcting $t$ or fewer symmetric errors and detecting all unidirectional errors with
the magnitude error criterion.

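Proof of the construction. By Theorem 1.1, we want to show that

$$N(\mathbf{u},\mathbf{v}) \ge t + 1 \quad \text{and} \quad N(\mathbf{v},\mathbf{u}) \ge t + 1$$

for all $\mathbf{u}, \mathbf{v} \in C_g$ and $\mathbf{u} \ne \mathbf{v}$.

Let $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$. Since

$$\mathbf{u}, \mathbf{v} \in C_g \subseteq V^n(\lfloor n(m-1)/2 \rfloor),$$

then

$$\sum_{i=1}^{n} u_i = \sum_{i=1}^{n} v_i = \lfloor n(m-1)/2 \rfloor \tag{3.3}$$

and

$$\Omega(\mathbf{u}) = \Omega(\mathbf{v}) = g(x) \in G. \tag{3.4}$$

It follows from (3.3) that

$$N(\mathbf{u},\mathbf{v}) = N(\mathbf{v},\mathbf{u}). \tag{3.5}$$

By (3.4), the element $\Omega(\mathbf{u})/\Omega(\mathbf{v})$ is the identity of $G$. This implies that in the
group $G$

$$\frac{\Omega(\mathbf{u})}{\Omega(\mathbf{v})} = \frac{\bigodot_{i=1}^{n} (x - \alpha_i)^{u_i}}{\bigodot_{i=1}^{n} (x - \alpha_i)^{v_i}} = 1. \tag{3.6}$$

Denote

$$S = \{i : v_i > u_i\}$$

and

$$T = \{i : u_i > v_i\}.$$

Then $S \cap T = \emptyset$, and either $S \ne \emptyset$ or $T \ne \emptyset$ since $\mathbf{u} \ne \mathbf{v}$. Furthermore,

$$N(\mathbf{u},\mathbf{v}) = \sum_{i \in S} (v_i - u_i), \tag{3.7}$$

$$N(\mathbf{v},\mathbf{u}) = \sum_{j \in T} (u_j - v_j). \tag{3.8}$$

It is easy to see from (3.6) that

$$\frac{\Omega(\mathbf{u})}{\Omega(\mathbf{v})} = \frac{\bigodot_{j \in T} (x - \alpha_j)^{u_j - v_j}}{\bigodot_{i \in S} (x - \alpha_i)^{v_i - u_i}} = 1$$

in the group $G$. This is equivalent to the fact that $f(x)$ divides the polynomial

$$A(x) = \prod_{j \in T} (x - \alpha_j)^{u_j - v_j} - \prod_{i \in S} (x - \alpha_i)^{v_i - u_i}.$$

The roots of the polynomial $\prod_{j \in T} (x - \alpha_j)^{u_j - v_j}$ are $\alpha_j$, $j \in T$, and the roots of
the polynomial $\prod_{i \in S} (x - \alpha_i)^{v_i - u_i}$ are $\alpha_i$, $i \in S$. Since

$$\{\alpha_i : i \in S\} \cap \{\alpha_i : i \in T\} = \emptyset$$

and either $S \ne \emptyset$ or $T \ne \emptyset$, we have

$$\prod_{j \in T} (x - \alpha_j)^{u_j - v_j} \ne \prod_{i \in S} (x - \alpha_i)^{v_i - u_i}.$$

Hence, $A(x) \ne 0$. Both products are monic, and by (3.7), (3.8) and the fact that
$N(\mathbf{u},\mathbf{v}) = N(\mathbf{v},\mathbf{u})$ they have the same degree $N(\mathbf{u},\mathbf{v})$, so the degree of $A(x)$ is at
most $N(\mathbf{u},\mathbf{v}) - 1$. Since $A(x) \ne 0$ and $f(x)$ divides $A(x)$, we have

$$N(\mathbf{u},\mathbf{v}) - 1 \ge \deg(A(x)) \ge \deg(f(x)) = t.$$

Hence,

$$N(\mathbf{u},\mathbf{v}) = N(\mathbf{v},\mathbf{u}) \ge t + 1.$$

This completes the proof. □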
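The construction can be tried out on toy parameters. The sketch below is a minimal illustration under simplifying assumptions: it works over a prime field $\mathbb{F}_p$ only, stores polynomials as coefficient tuples with the lowest degree first, and enumerates all words of $V^n(\lfloor n(m-1)/2 \rfloor)$ exhaustively. The function names and the parameters $q = 5$, $m = 4$, $n = 4$, $f(x) = x^2 + 2$ are illustrative choices, not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations, product


def mul_mod(a, b, f, p):
    """Multiply a and b over F_p and reduce modulo the monic polynomial f."""
    t = len(f) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # Cancel every term of degree >= t by subtracting a multiple of f.
    for d in range(len(prod) - 1, t - 1, -1):
        c = prod[d]
        for k in range(t + 1):
            prod[d - t + k] = (prod[d - t + k] - c * f[k]) % p
    res = prod[:t]
    return tuple(res + [0] * (t - len(res)))


def omega(c, alphas, f, p):
    """Image of the word c: the product of (x - alpha_i)^{c_i} modulo f."""
    acc = (1,) + (0,) * (len(f) - 2)
    for ci, alpha in zip(c, alphas):
        for _ in range(ci):
            acc = mul_mod(acc, ((-alpha) % p, 1), f, p)
    return acc


def N(x, y):
    """N(x, y): sum of max(y_i - x_i, 0) over all coordinates."""
    return sum(max(b - a, 0) for a, b in zip(x, y))


def build_classes(m, alphas, f, p):
    """Partition V^n(floor(n(m-1)/2)) into the classes C_g, keyed by g."""
    n = len(alphas)
    r = n * (m - 1) // 2
    classes = defaultdict(list)
    for c in product(range(m), repeat=n):
        if sum(c) == r:
            classes[omega(c, alphas, f, p)].append(c)
    return classes


def satisfies_theorem(code, t):
    """Condition (iii) of Theorem 1.1 for every pair of distinct codewords."""
    return all(N(u, v) >= t + 1 and N(v, u) >= t + 1
               for u, v in combinations(code, 2))


# f(x) = x^2 + 2 has no root in F_5, so alpha_i = 1, 2, 3, 4 are all admissible.
p, m, alphas, f = 5, 4, (1, 2, 3, 4), (2, 0, 1)
t = len(f) - 1
classes = build_classes(m, alphas, f, p)
print(all(satisfies_theorem(code, t) for code in classes.values()))
print(max(len(code) for code in classes.values()))
```

Exhaustive enumeration of $m^n$ words is only feasible for very small parameters; it is used here purely to make the partition into the classes $C_g$ visible.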







From the construction, we know that the sets $C_g$, $g \in G$, form a partition of
$V^n(\lfloor n(m-1)/2 \rfloor)$. Since $|G| = \Phi(f(x))$, we can find one element $\pi(x) \in G$
such that

$$|C_\pi| \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{\Phi(f(x))}. \tag{3.9}$$

Hence, we obtain the following result.



Theorem 3.1. Let $\mathbb{F}_q$ be a finite field of $q$ elements, where $q$ is a prime power. Let
$m$, $n$ and $t$ be three positive integers satisfying $m \ge 2$, $n \le q$ and $1 \le t \le n(m-1)$.
Let $f(x) \in \mathbb{F}_q[x]$ be a monic polynomial of degree $t$ such that there exist $n$ distinct
elements $\alpha_1, \alpha_2, \ldots, \alpha_n \in \mathbb{F}_q$ with $f(\alpha_i) \ne 0$ for all $i = 1, 2, \ldots, n$. Then there
exists an $m$-ary code $C$ of length $n$ and size

$$|C| \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{\Phi(f(x))} \tag{3.10}$$

which can correct $t$ or fewer symmetric errors and detect all unidirectional errors
with the magnitude error criterion.



From the construction, it is easy to see that

Corollary 3.2. With notations as in the construction, we have

$$\Gamma_m(n,t) \ge \max_{g \in G} |C_g|. \tag{3.11}$$

Bound (3.11) is in general stronger than Bound (3.10), but it is less explicit
and requires more computation to determine.

Remark 3.3. In the proof of the construction, if we define the map $\Omega : V^n(r) \to G$
in the same way, we obtain a code with at least $\binom{n}{r}_m / \Phi(f(x))$ codewords. By
Proposition 2.2, we take $r = \lfloor n(m-1)/2 \rfloor$ in order to make the code size big.



4. New Lower Bounds for $\Gamma_m(n,t)$

In this section, we show that some new lower bounds for $\Gamma_m(n,t)$ can be obtained
from Theorem 3.1. Note that the lower bounds for $\Gamma_m(n,t)$ obtained by Theorem
3.1 depend on the selection of $f(x)$. It seems that the following selections of $f(x)$
are optimal for the corresponding cases.



Theorem 4.1.

(i) If $n$ is a prime power, $n \ge m$ and $2 \le t \le n(m-1)$, then

$$\Gamma_m(n,t) \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{(n^2 - 1)^r (n^3 - 1)^s} \tag{4.1}$$

where $r$ and $s$ are the two unique non-negative integers satisfying $t = 2r + 3s$
and $s \in \{0, 1\}$.

(ii) If $n$ is not a prime power and $n \ge m$, denote by $k$ the least positive integer
such that $q = n + k$ is a prime power. If $2 \le t \le k$, then

$$\Gamma_m(n,t) \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{(q-1)^t}. \tag{4.2}$$

If $k < t \le n(m-1)$, then

$$\Gamma_m(n,t) \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{(q-1)^k q^{s'} (q^2 - 1)^{r'}}, \tag{4.3}$$

where $r'$ and $s'$ are the two unique non-negative integers satisfying $t - k =
2r' + s'$ and $s' \in \{0, 1\}$.


Proof. (i) Let $q = n$ in Theorem 3.1 since $n$ is a prime power. Let

$$\mathbb{F}_q = \{\alpha_1, \alpha_2, \ldots, \alpha_q\}.$$

Note that the number of monic quadratic irreducible polynomials over $\mathbb{F}_q$ is
$q(q-1)/2$. Since $n \ge m$, we have

$$r \le \frac{t}{2} \le \frac{n(m-1)}{2} \le \frac{n(n-1)}{2} = \frac{q(q-1)}{2}.$$

Hence, we can choose $r$ distinct monic quadratic irreducible polynomials

$$p_1(x), p_2(x), \ldots, p_r(x)$$

in $\mathbb{F}_q[x]$ and a monic cubic irreducible polynomial $p(x)$ in $\mathbb{F}_q[x]$. Let

$$f(x) = p(x)^s \prod_{i=1}^{r} p_i(x).$$

Then $\deg(f(x)) = t$ and

$$\Phi(f(x)) = (q^2 - 1)^r (q^3 - 1)^s = (n^2 - 1)^r (n^3 - 1)^s.$$

It is easy to see that $f(\alpha_i) \ne 0$ for all $i = 1, 2, \ldots, n$. Hence, (4.1) follows from
Theorem 3.1.

(ii) In Theorem 3.1, let

$$\mathbb{F}_q = \{\beta_1, \beta_2, \ldots, \beta_k, \alpha_1, \alpha_2, \ldots, \alpha_n\}.$$

If $2 \le t \le k$, let

$$f(x) = (x - \beta_1)(x - \beta_2) \cdots (x - \beta_t).$$

Then

$$\Phi(f(x)) = (q-1)^t.$$

If $k < t \le n(m-1)$, by the fact that $t - k = 2r' + s'$, we have

$$r' \le \frac{t}{2} \le \frac{n(m-1)}{2} \le \frac{n(n-1)}{2} < \frac{q(q-1)}{2}.$$

Hence, we can choose $r'$ distinct monic quadratic irreducible polynomials

$$p_1(x), p_2(x), \ldots, p_{r'}(x)$$

in $\mathbb{F}_q[x]$. Let

$$f(x) = (x - \beta_1)^{1+s'} (x - \beta_2) \cdots (x - \beta_k) \prod_{i=1}^{r'} p_i(x).$$

Then $\deg(f(x)) = t$ and

$$\Phi(f(x)) = (q-1)^k q^{s'} (q^2 - 1)^{r'}.$$

It is easy to see that $f(\alpha_i) \ne 0$ for all $i = 1, 2, \ldots, n$. Hence, (4.2) and (4.3) follow
from Theorem 3.1. □
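The bounds (4.1), (4.2) and (4.3) are simple to evaluate. The following sketch selects the applicable case and returns the bound rounded up to the next integer, which is legitimate because a code size is an integer. The function names, the trial-division prime-power test and the rounding step are assumptions made for this illustration.

```python
from fractions import Fraction
from functools import lru_cache
from math import ceil


@lru_cache(maxsize=None)
def gen_binom(n, r, m):
    """Generalized binomial coefficient, computed by the recurrence on n."""
    if r < 0:
        return 0
    if n == 1:
        return 1 if r <= m - 1 else 0
    return sum(gen_binom(n - 1, r - i, m) for i in range(m))


def is_prime_power(q):
    """Trial division: strip the smallest prime factor and see whether 1 remains."""
    if q < 2:
        return False
    for p in range(2, q + 1):
        if q % p == 0:
            while q % p == 0:
                q //= p
            return q == 1
    return False


def lower_bound(m, n, t):
    """Lower bound on the maximum code size from (4.1), (4.2) or (4.3)."""
    if not (n >= m >= 2 and 2 <= t <= n * (m - 1)):
        raise ValueError("need n >= m >= 2 and 2 <= t <= n(m - 1)")
    top = gen_binom(n, n * (m - 1) // 2, m)
    if is_prime_power(n):
        s = t % 2                      # t = 2r + 3s with s in {0, 1}
        r = (t - 3 * s) // 2
        denom = (n ** 2 - 1) ** r * (n ** 3 - 1) ** s
    else:
        k = 1
        while not is_prime_power(n + k):
            k += 1
        q = n + k
        if t <= k:
            denom = (q - 1) ** t
        else:
            s = (t - k) % 2            # t - k = 2r' + s' with s' in {0, 1}
            r = (t - k - s) // 2
            denom = (q - 1) ** k * q ** s * (q ** 2 - 1) ** r
    return ceil(Fraction(top, denom))


# Binary codes of length 16 with t = 2: case (i), since 16 is a prime power.
print(lower_bound(2, 16, 2))
# Ternary codes of length 6 with t = 3: case (ii) with k = 1 and q = 7.
print(lower_bound(3, 6, 3))
```

Using `Fraction` keeps the quotient exact before rounding, which avoids floating-point error when the binomial coefficient in the numerator is large.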



Letting $k = 1, 2$ in Theorem 4.1(ii), we obtain

Corollary 4.2.

(i) If $n + 1$ is a prime power and $n \ge m$, then for $2 \le t \le n(m-1)$

$$\Gamma_m(n,t) \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{n (n+1)^s [(n+1)^2 - 1]^r} \tag{4.4}$$

where $r$ and $s$ are the two unique non-negative integers satisfying $t - 1 = 2r + s$
and $s \in \{0, 1\}$.

(ii) If $n + 2$ is a prime power and $n \ge m$, then for $3 \le t \le n(m-1)$

$$\Gamma_m(n,t) \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{(n+1)^2 (n+2)^s [(n+2)^2 - 1]^r} \tag{4.5}$$

where $r$ and $s$ are the two unique non-negative integers satisfying $t - 2 = 2r + s$
and $s \in \{0, 1\}$.

The lower bound (4.1) in Theorem 4.1 can be rewritten in the following form.
If $n$ is a prime power, $n \ge m$ and $2 \le t \le n(m-1)$, then

$$\Gamma_m(n,t) \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{(n^2 - 1)^{t/2}}, \quad t \text{ even}, \tag{4.6}$$

$$\Gamma_m(n,t) \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{(n^2 - 1)^{(t-3)/2} (n^3 - 1)}, \quad t \text{ odd}. \tag{4.7}$$

The lower bound (4.4) in Corollary 4.2 can be rewritten in the following form. If
$n + 1$ is a prime power and $n \ge m$, then for $2 \le t \le n(m-1)$

$$\Gamma_m(n,t) \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{n(n+1)[(n+1)^2 - 1]^{(t-2)/2}}, \quad t \text{ even}, \tag{4.8}$$

$$\Gamma_m(n,t) \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{n[(n+1)^2 - 1]^{(t-1)/2}}, \quad t \text{ odd}. \tag{4.9}$$

The lower bound (4.5) in Corollary 4.2 can be rewritten in the following form. If
$n + 2$ is a prime power and $n \ge m$, then for $3 \le t \le n(m-1)$

$$\Gamma_m(n,t) \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{(n+1)^2 [(n+2)^2 - 1]^{(t-2)/2}}, \quad t \text{ even}, \tag{4.10}$$

$$\Gamma_m(n,t) \ge \frac{\binom{n}{\lfloor n(m-1)/2 \rfloor}_m}{(n+1)^2 (n+2) [(n+2)^2 - 1]^{(t-3)/2}}, \quad t \text{ odd}. \tag{4.11}$$

Acknowledgment 

The authors would like to thank the anonymous reviewer and Professor Harald 
Niederreiter for their valuable suggestions and comments that helped to improve 
the paper. 



References 

[1] D.K. Bhattacharyya and S.J. Nandi, Theory and Design of SEC-DED-AUED Codes.
IEE Proceedings - Computers and Digital Techniques 145 (1998), 121-126.

[2] M. Blaum, Codes for Detecting and Correcting Unidirectional Errors. IEEE Computer
Society Press, Los Alamitos, California, 1993.

[3] M. Blaum and H. van Tilborg, On t-Error Correcting/All Unidirectional Error Detecting
Codes. IEEE Trans. Computers 38 (1989), 1493-1501.

[4] F.J.H. Boinck and H. van Tilborg, Constructions and Bounds for Systematic 
tEC/AUED Codes. IEEE Trans. Inform. Theory 36 (1990), 1381-1390. 

[5] B. Bose, On Unordered Codes. IEEE Trans. Computers 40 (1991), 125-131. 

[6] B. Bose and T.R.N. Rao, Theory of Unidirectional Error Correcting/Detecting Codes.
IEEE Trans. Computers 31 (1982), 521-530.

[7] J. Bruck and M. Blaum, New Techniques for Constructing EC/AUED Codes. IEEE 
Trans. Computers 41 (1992), 1318-1324. 

[8] C.C. Chen and K.M. Koh, Principles and Techniques in Combinatorics. World Sci- 
entific, Singapore, 1992, pp. 215-216. 

[9] C. Cooper and R.E. Kennedy, A Dice-Tossing Problem. Crux Mathematicorum 10
(1984), 134-138.

[10] P. Delsarte and P. Piret, Spectral Enumerators for Certain Additive-Error-Correcting
Codes over Integer Alphabets. Inform. Contr. 48 (1981), 193-210.

[11] F.-W. Fu, S. Ling and C.P. Xing, New Lower Bounds and Constructions for Binary
Codes Correcting Asymmetric Errors. IEEE Trans. Inform. Theory 48(12) (2003),
to appear.

[12] T. Helleseth and T. Kløve, On Group-Theoretic Codes for Asymmetric Channels.
Inform. Contr. 49 (1981), 1-9.

[13] R.S. Katti, A Note on SEC/AUED Codes. IEEE Trans. Computers 45 (1996), 244- 
246. 

[14] R.S. Katti and M. Blaum, An Improvement on Constructions of t-EC/AUED Codes. 
IEEE Trans. Computers 45 (1996), 607-608. 







[15] T. Kløve, Error Correcting Codes for the Asymmetric Channel. Rep. 18-09-07-81,
Dept. Mathematics, University of Bergen, July 1981 (revised in 1983 and updated
in 1995).

[16] T. Kløve, On Robinson’s Coding Problem. IEEE Trans. Inform. Theory 29 (1983),
450-454.

[17] S. Kundu and S.M. Reddy, On Symmetric Error Correcting and All Unidirectional 
Error Detecting Codes. IEEE Trans. Computers 39 (1990), 752-761. 

[18] C.-S. Laih and C.-N. Yang, On the Analysis and Design of Group Theoretical t- 
SYEC/AUED Codes. IEEE Trans. Computers 45 (1996), 103-108. 

[19] D.J. Lin and B. Bose, Theory and Design of t-Error Correcting and d(d > t)-
Unidirectional Error Detecting (t-EC d-UED) Codes. IEEE Trans. Computers 37
(1988), 433-439.

[20] D.J. Lin and B. Bose, On the Maximality of the Group Theoretic Single Error
Correcting and All Unidirectional Error Detecting (SEC-AUED) Codes. Sequences:
Combinatorics, Compression, Security, and Transmission, Springer-Verlag (Editor:
R. Capocelli), pp. 506-529, 1990.

[21] M.-C. Lin, Constant Weight Codes for Correcting Symmetric Errors and Detecting 
Unidirectional Errors. IEEE Trans. Computers 42 (1993), 1294-1302. 

[22] B.L. Montgomery and B.V. Kumar, Systematic Random Error Correcting and All 
Unidirectional Error Detecting Codes. IEEE Trans. Computers 39 (1990), 836-840. 

[23] S.J. Nandi and P.P. Chaudhuri, New Class of t-Error Correcting and All Unidirectional
Error Detecting (t-EC/AUED) Codes. IEE Proceedings - Computers and
Digital Techniques 142 (1995), 32-40.

[24] D. Nikolos, Theory and Design of t-Error Correcting/d-Error Detecting (d > t) and
All Unidirectional Error Detecting Codes. IEEE Trans. Computers 40 (1991), 132-
142.

[25] D. Nikolos and A. Krokos, Theory and Design of t-Error Correcting, k-Error Detecting
and d-Unidirectional Error Detecting Codes with d > k > t. IEEE Trans.
Computers 41 (1992), 411-419.

[26] T.R.N. Rao and E. Fujiwara, Error-Control Coding for Computer Systems. Englewood
Cliffs, NJ: Prentice-Hall Inc., 1989.

[27] J.P. Robinson, An Asymmetric Error-Correcting Ternary Code. IEEE Trans. Inform.
Theory 24 (1978), 258-261.

[28] D.L. Tao, C.R.P. Hartmann, and P.K. Lala, A Note on t-EC/d-UED Codes. IEEE
Trans. Computers 40 (1991), 660-663.

[29] R.R. Varshamov, A General Method of Constructing Asymmetric Coding Systems, 
related to the Solution of a Combinatorial Problem Proposed by Dixon. Doklady 
Akad. Nauk. USSR 194 (1970), 284-287 (trans: Soviet Physics-Doklady 15 (1970), 
811-813). 

[30] J.H. Weber, Asymptotic Results on Codes for Symmetric, Unidirectional, and Asym- 
metric Error Control. IEEE Trans. Inform. Theory 40 (1994), 2073-2075. 

[31] J.H. Weber, C. de Vroedt and D.E. Boekee, Bounds and Constructions for Codes 
Correcting Unidirectional Errors. IEEE Trans. Inform. Theory 35 (1989), 797-810. 







[32] J.H. Weber, C. de Vroedt and D.E. Boekee, Necessary and Sufficient Conditions on
Block Codes Correcting/Detecting Errors of Various Types. IEEE Trans. Computers
41 (1992), 1189-1193.

[33] C.P. Xing, Constructions of Codes from Residue Rings of Polynomials. IEEE Trans. 
Inform. Theory 48 (2002), 2995-2997. 

[34] C.-N. Yang and C.-S. Laih, DCm Codes for Constructing t-EC/AUED Codes. IEEE 
Trans. Computers 47 (1998), 492-495. 

[35] Z. Zhang and X.-G. Xia, LYM-Type Inequalities for tEC/AUED Codes. IEEE Trans. 
Inform. Theory 39 (1993), 232-238. 

[36] Z. Zhang and X.-G. Xia, On the Construction of Systematic tEC/AUED Codes. 
IEEE Trans. Inform. Theory 39 (1993), 1662-1669. 



Fang-Wei Fu
Temasek Laboratories, National University of Singapore
Engineering Drive 3, 10 Kent Ridge Crescent
Singapore 119260, Republic of Singapore
on leave from
the Department of Mathematics, Nankai University
Tianjin 300071, P. R. China
e-mail: tslfufw@nus.edu.sg

San Ling
Department of Mathematics
National University of Singapore
2 Science Drive 2
Singapore 117543, Republic of Singapore
e-mail: matlings@nus.edu.sg

Chaoping Xing
Department of Mathematics
National University of Singapore
2 Science Drive 2
Singapore 117543, Republic of Singapore
and
Department of Mathematics
University of Science and Technology of China
Hefei, Anhui 230026, P. R. China
e-mail: matxcp@nus.edu.sg




Progress in Computer Science and Applied Logic, Vol. 23, 153-168 
© 2004 Birkhauser Verlag Basel/Switzerland 



On the Propagation Criterion 
of Boolean Functions 

Aline Gouget 



Abstract. The propagation criterion is one of the main cryptographic criteria
on Boolean functions used in block ciphers. Quadratic Boolean functions satisfying
the propagation criterion of high degree were given by Preneel et al.,
but their algebraic degree is too small for cryptographic use. Designing
Boolean functions of high algebraic degree and high degree of propagation
has since been the goal of several papers. In this paper, we investigate the work
of Kurosawa and Satoh in order to optimize the algebraic degree and the degree
of propagation, and the work of Honda, Satoh, Iwata, and Kurosawa, by
giving in particular a construction of Boolean functions satisfying PC(3) and
having a very large algebraic degree. We also show that among symmetric
functions, only the quadratic ones satisfy the propagation criterion of degree
greater than 1. A particular case of this result is that symmetric bent functions
must be quadratic, a result that previously needed a whole paper to be proved.

Keywords. Boolean functions, Block-Cipher, Propagation criterion, Symmetric
functions.



1. Introduction 

The security of block ciphers (e.g., DES, AES) is often discussed by viewing their
encrypting functions (more precisely, their S-boxes) as a set of Boolean functions.
Part of the security of the system relies on the choice of these Boolean functions,
which must fulfil several cryptographic criteria. Such a Boolean function $f$ must
be balanced, i.e., take the value 1 and the value 0 with the same probability 1/2.
The function $f$ must also have high algebraic degree (the degree of its polynomial
representation on $n$ binary variables, called the algebraic normal form), and must
satisfy the propagation criterion.

In this paper, we focus on the construction of Boolean functions which satisfy
the propagation criterion of degree $l$, have a high algebraic degree, and, in
some cases, are balanced. In Section 2, we introduce the definitions and notation
that are needed in the paper. We briefly recall the most important results on the
propagation criterion.

In Section 3, we recall Kurosawa-Satoh’s construction [12], which is the first
construction of nonquadratic Boolean functions satisfying PC($l$) of order $k$ (for
some values $l$ and $k$). This construction is a particular case of the famous Maiorana-
McFarland construction of those functions which can be written in the form
$f(x,y) = x \cdot \phi(y) \oplus g(y)$, where $\phi$ is a mapping from $\mathbb{F}_2^s$ into $\mathbb{F}_2^t$ and $g$ is any
$s$-variable Boolean function, where $s$ and $t$ are two positive integers. Kurosawa
and Satoh set the function $\phi$ to be linear. By using the properties of linear error-
correcting codes, they obtained some $n$-variable Boolean functions satisfying the
propagation criterion with high degree of propagation ($l \approx n/4$) and small order
($k$ is a constant near 3) or vice versa. The functions they obtained have algebraic
degree at most $n/2$. We show that a slight modification of Kurosawa-Satoh’s
construction leads to a construction of Boolean functions which have a degree
of propagation at least as good as, and a higher algebraic degree than, the
previous constructions; the order of propagation is no longer ensured. Carlet [4]
generalized Kurosawa-Satoh’s construction by using a not necessarily linear mapping
$\phi$; he constructed Boolean functions satisfying PC($l$) of order $k$ by using
nonlinear codes. We give a table of values for the different parameters (number of
variables, algebraic degree, order and degree of propagation) which can be achieved
by using the linear and nonlinear codes proposed by Kurosawa, Satoh, and Carlet.
We propose to use another linear code, the parity check code, which allows us to
obtain a higher degree of propagation ($l \approx n/2$) than the codes previously proposed.
Furthermore, we give the values obtained by using several nonlinear codes
(see [2, 15]) constructed by means of the Hensel lifting to $\mathbb{Z}_4$ of quadratic residue
codes and the application of the Gray map. We also give a new construction which
provides balanced Boolean functions having an odd number of variables and almost
the same values of parameters as those obtained with the parity check
code. Next, Honda, Satoh, Iwata, and Kurosawa [11] obtained a construction of
Boolean functions satisfying the propagation criterion of degree 2 and having high
algebraic degree ($d \approx n - \log_2 n$, where $n$ is the number of variables) by using the
generator matrix of the simplex code. We generalize this construction by using
a not necessarily linear mapping $\phi$. Furthermore, we adapt this construction to
obtain Boolean functions satisfying PC(3) and having the same algebraic degree.
Finally, we propose a general construction for Boolean functions satisfying Odd-
PC, a criterion introduced by Bernasconi [1].

In Section 4, we focus on the propagation criterion of symmetric functions, i.e., the
functions which are invariant under any permutation of input coordinates. We show 
that the construction of PC( 1) symmetric functions is equivalent to the construc- 
tion of balanced functions. We recall a trivial construction of balanced symmetric 
functions for every odd number of variables n. By exhaustive search, we check that, 
for almost every odd integer n lower than 26, this trivial construction generates all 
symmetric balanced functions. Von zur Gathen and Roche [10] proposed several 
constructions of Boolean functions having numerical degree (defined in Section 2) 




On the Propagation Criterion of Boolean Functions 






equal to $n - 1$. We use these constructions to provide PC(1) Boolean functions.
Savicky [19] proved that the only symmetric bent functions are the quadratic
symmetric functions. His proof needed a whole paper. We prove more generally, and
in a few lines, that the symmetric functions satisfying PC(l), where $l \ge 2$, are the
four quadratic functions $\bigoplus_{1 \le i < j \le n} x_i x_j \oplus a_0$ and
$\bigoplus_{1 \le i < j \le n} x_i x_j \oplus \bigoplus_{i=1}^{n} x_i \oplus a_0$,
where $a_0$ is in $\mathbb{F}_2$.

2. Preliminaries 

Let n be any positive integer. We denote by $\oplus$ the usual addition in $\mathbb{F}_2$ and in $\mathbb{F}_2^n$.
The Hamming weight $w_H(u)$ of a word u in $\mathbb{F}_2^n$ is the number of its components
equal to 1. We denote by $\preceq$ the partial order on the words of $\mathbb{F}_2^n$, i.e.,
$(u_1, \dots, u_n) \preceq (v_1, \dots, v_n)$ if and only if $(u_i = 1) \Rightarrow (v_i = 1)$ for
every $i = 1, \dots, n$. The Hamming weight $w_H(f)$ of an n-variable Boolean function f
is the size of its support, i.e., the size of the set $\{x \in \mathbb{F}_2^n \mid f(x) = 1\}$.

2.1. Representation of Boolean Functions 

Any Boolean function f in n variables, $f : \mathbb{F}_2^n \to \mathbb{F}_2$, admits a unique algebraic
normal form (ANF),
$f(x_1, \dots, x_n) = \bigoplus_{u \in \mathbb{F}_2^n} a_u \left( \prod_{i=1}^{n} x_i^{u_i} \right) = \bigoplus_{u \in \mathbb{F}_2^n} a_u x^u$.
The function $g : u \mapsto a_u$ is called the binary Möbius transform of f. For any word u,
the coefficient $a_u$ belongs to $\mathbb{F}_2$ and can be computed by the formula
$a_u = \bigoplus_{v \in \mathbb{F}_2^n,\ v \preceq u} f(v)$. The algebraic degree of a Boolean function f is
the degree of its algebraic normal form. Every Boolean function f can also be
uniquely represented by its Numerical Normal Form (NNF) [6], i.e., its polynomial
representation over $\mathbb{Z}$,
$f(x_1, \dots, x_n) = \sum_{u \in \mathbb{F}_2^n} \lambda_u \left( \prod_{i=1}^{n} x_i^{u_i} \right) = \sum_{u \in \mathbb{F}_2^n} \lambda_u x^u$.
For any Boolean function f and for every word u in $\mathbb{F}_2^n$, the coefficient $\lambda_u$ can be
computed by the formula
$\lambda_u = (-1)^{w_H(u)} \sum_{v \in \mathbb{F}_2^n,\ v \preceq u} (-1)^{w_H(v)} f(v)$. The numerical degree of a
Boolean function f is the degree of its NNF representation.
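
The two coefficient formulas above can be evaluated directly for small n. The following is a minimal sketch, not an optimized transform; the helper names (`words`, `precedes`, `anf`, `nnf`, `degree`) and the sample function are illustrative assumptions and do not come from the paper.

```python
from itertools import product

def words(n):
    """All words of F_2^n as tuples."""
    return list(product((0, 1), repeat=n))

def precedes(v, u):
    """Partial order: v precedes u iff every 1 of v is also a 1 of u."""
    return all(vi <= ui for vi, ui in zip(v, u))

def anf(f, n):
    """ANF coefficients: a_u is the XOR of f(v) over all v preceding u."""
    return {u: sum(f(v) for v in words(n) if precedes(v, u)) % 2
            for u in words(n)}

def nnf(f, n):
    """NNF coefficients: lambda_u = (-1)^wH(u) * sum over v preceding u of (-1)^wH(v) f(v)."""
    return {u: (-1) ** sum(u)
               * sum((-1) ** sum(v) * f(v) for v in words(n) if precedes(v, u))
            for u in words(n)}

def degree(coeffs):
    """Largest Hamming weight of an index with a nonzero coefficient."""
    return max((sum(u) for u, c in coeffs.items() if c != 0), default=0)

# Hypothetical sample function: f(x) = x1 x2 XOR x3.
def sample(x):
    return (x[0] & x[1]) ^ x[2]

a = anf(sample, 3)   # nonzero exactly on x1 x2 and x3
lam = nnf(sample, 3) # over Z: x1 x2 + x3 - 2 x1 x2 x3
```

For this sample the algebraic degree is 2 while the numerical degree is 3, which illustrates that the two notions of degree differ in general.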

2.2. Criteria on Boolean Functions 

An n-variable Boolean function f is balanced if its Hamming weight equals $2^{n-1}$.
The function $x \mapsto f(x) \oplus f(x \oplus u)$, denoted by $D_u f$, is called the derivative of
f over u. The Strict Avalanche Criterion (SAC) was introduced by Webster and
Tavares [21] in 1985; it was later generalized by Forré [9], who defined an
order over it, and by Preneel, Van Leekwijck and Van Linden [16], who defined the
propagation criterion of degree l and order k.

Definition 1. Let f be an n-variable Boolean function and l a positive integer. The
function f satisfies the Propagation Criterion of degree l, denoted PC(l), if for all
words $u \in \mathbb{F}_2^n$ such that $0 < w_H(u) \le l$, the function $D_u f$ is balanced.

Definition 2. Let l and k be two positive integers and f an n-variable Boolean
function. The function f satisfies the propagation criterion of degree l and order
k (PC(l) of order k) if any function obtained from f by keeping constant k of its
input coordinates satisfies PC(l).
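
Definitions 1 and 2 translate into an exhaustive check for small n. The sketch below is a brute-force illustration under that assumption; the names `satisfies_pc` and `satisfies_pc_of_order` and the two test functions are hypothetical, not part of the paper.

```python
from itertools import combinations, product

def satisfies_pc(f, n, l):
    """PC(l): D_u f is balanced for every u with 0 < wH(u) <= l."""
    pts = list(product((0, 1), repeat=n))
    tt = {x: f(x) for x in pts}
    for u in pts:
        if 0 < sum(u) <= l:
            ones = sum(tt[x] ^ tt[tuple(a ^ b for a, b in zip(x, u))] for x in pts)
            if 2 * ones != len(pts):
                return False
    return True

def satisfies_pc_of_order(f, n, l, k):
    """PC(l) of order k: every restriction fixing k coordinates satisfies PC(l)."""
    for fixed in combinations(range(n), k):
        free = [i for i in range(n) if i not in fixed]
        for consts in product((0, 1), repeat=k):
            def restricted(y, fixed=fixed, free=free, consts=consts):
                x = [0] * n
                for i, c in zip(fixed, consts):
                    x[i] = c
                for i, v in zip(free, y):
                    x[i] = v
                return f(tuple(x))
            if not satisfies_pc(restricted, n - k, l):
                return False
    return True

# Hypothetical examples on 4 variables.
def bent4(x):      # x1 x2 XOR x3 x4
    return (x[0] & x[1]) ^ (x[2] & x[3])

def sigma2(x):     # sum of all products x_i x_j with i < j
    return sum(x[i] & x[j] for i in range(4) for j in range(i + 1, 4)) % 2
```

In this example `bent4` satisfies PC(4) but loses PC(1) as soon as one coordinate is fixed, whereas the symmetric quadratic function `sigma2` satisfies PC(2) of order 2.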







A. Gouget 



The notion of order on the propagation criterion is related to correlation
immunity, another cryptographic criterion, introduced by Siegenthaler [20]. An
n-variable Boolean function f whose output probability distribution does not change
when at most k input coordinates are kept constant is called correlation-immune
of order k. Furthermore, if the function f is balanced, then it is called k-resilient.
The nonlinearity of f is its minimum distance to the set of all affine functions
(the functions having algebraic degree at most 1). A Boolean function f
is called bent if its nonlinearity equals $2^{n-1} - 2^{n/2-1}$, i.e., the maximum possible
value (n must be even).



2.3. Properties and Constructions of Boolean Functions 

We are interested in the construction of Boolean functions having high algebraic
degrees and satisfying the propagation criterion; these two criteria are partially
in conflict. Siegenthaler [20] showed that the algebraic degree d of any function f
satisfying correlation immunity of order k ($0 \le k \le n$) is upper bounded by $n - k$.
Furthermore, if f is balanced and $0 \le k < n - 1$, then d is at most $n - k - 1$, and
$d = 1$ if $k = n - 1$. The following upper bound on the algebraic degree of Boolean
functions satisfying PC(1) of order k is a direct consequence of this bound.

Proposition 1. [16] Let f be an n-variable Boolean function. If f satisfies PC(1)
of order k, where $0 \le k < n - 2$, then f has algebraic degree at most $n - k - 1$.

Rothaus [18] proved that a Boolean function f satisfies PC(n) if and only if it
is a bent function; consequently, bent functions are also called perfectly nonlinear
[13]. Rothaus also proved that the algebraic degree of any bent function is upper
bounded by $n/2$. Zheng and Zhang established an explicit lower bound on the
nonlinearity $N_f$ of a function f satisfying PC(l), which shows that the higher the
degree of propagation, the higher the guaranteed nonlinearity.

Proposition 2. [22] If f is an n-variable Boolean function satisfying PC(l), then
the nonlinearity $N_f$ of f satisfies $N_f \ge 2^{n-1} - 2^{n-1-l/2}$.

We say that a Boolean function f linearly depends on $x_i$ if the function f
actually depends on the variable $x_i$ and $x_i$ occurs in the ANF of f only in the
monomial of degree 1. If a Boolean function f linearly depends on at least one
variable, then f is trivially balanced. This property is often used in order to prove
the propagation criterion of Boolean functions.

One of the most important classes of Boolean functions is obtained by
Maiorana-MacFarland's construction. This construction was introduced in the 70's by
Dillon [8] in order to design perfectly nonlinear functions and it was later extended
[3] to design resilient functions. Maiorana-MacFarland's functions are defined by
$f(x, y) = x \cdot \phi(y) \oplus g(y)$, where $x \in \mathbb{F}_2^s$, $y \in \mathbb{F}_2^{n-s}$, g is any (n − s)-variable Boolean
function and $\phi$ is any mapping from $\mathbb{F}_2^{n-s}$ into $\mathbb{F}_2^s$.
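
As a small illustration of this class, the sketch below builds two 4-variable functions of the form $x \cdot \phi(y) \oplus g(y)$ with s = 2 and compares their nonlinearity by brute force. The specific mappings and the helper `nonlinearity` are illustrative assumptions; the fact used is the classical one that a bijective $\phi$ (with n = 2s) gives a bent function.

```python
from itertools import product

def nonlinearity(f, n):
    """Minimum Hamming distance from f to the affine functions a.x XOR c."""
    pts = list(product((0, 1), repeat=n))
    best = len(pts)
    for a in pts:
        for c in (0, 1):
            dist = sum(f(x) ^ c ^ (sum(ai & xi for ai, xi in zip(a, x)) % 2)
                       for x in pts)
            best = min(best, dist)
    return best

def maiorana(phi, g):
    """f(x, y) = x . phi(y) XOR g(y) with x, y in F_2^2 (assumed sizes)."""
    def f(v):
        x, y = v[:2], v[2:]
        p = phi(y)
        return (x[0] & p[0]) ^ (x[1] & p[1]) ^ g(y)
    return f

# phi_perm is a permutation of F_2^2, phi_flat is not (hypothetical choices).
phi_perm = lambda y: (y[1], y[0] ^ y[1])
phi_flat = lambda y: (y[0], y[0])
g = lambda y: y[0] & y[1]

f_bent = maiorana(phi_perm, g)
f_other = maiorana(phi_flat, g)
```

With n = 4 the bent bound is $2^{3} - 2^{1} = 6$; the permutation case reaches it, while the non-injective mapping does not.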







3. Design of Boolean Functions Satisfying PC(l) 

3.1. Maiorana-MacFarland’s Construction 

Kurosawa and Satoh studied in [12] a class of functions within the general class of
the so-called Maiorana-MacFarland's functions, $f(x, y) = x \cdot \phi(y) \oplus g(y)$, where $\phi$
is a mapping from $\mathbb{F}_2^t$ into $\mathbb{F}_2^s$ and g is any t-variable Boolean function, where s and
t are two positive integers. They set the function $\phi$ to be linear and gave a sufficient
condition for such Maiorana-MacFarland's Boolean functions to satisfy PC(l) of
order k. This construction allowed them to construct Boolean functions having
high degree of propagation and low order of propagation, or vice versa, according
to their choice of the linear code used for the construction. Indeed, order and
degree of propagation depend on the minimal distance and the dual distance of
the linear code. Furthermore, this construction is the first one to provide functions
having algebraic degree greater than 2. Carlet [4] generalized Kurosawa-Satoh's
construction to not necessarily linear mappings $\phi$ and defined sufficient conditions
on $\phi$ to construct Boolean functions satisfying PC(l) of order k.

Proposition 3. [4] Let f be a Maiorana-MacFarland function $f(x, y) = x \cdot \phi(y) \oplus g(y)$,
where $x \in \mathbb{F}_2^s$, $y \in \mathbb{F}_2^t$, and let g be any t-variable function. If the mapping $\phi$
from $\mathbb{F}_2^t$ into $\mathbb{F}_2^s$ satisfies the following conditions:

1. the sum of at least one and at most l coordinates of $\phi$ is k-resilient,

2. if $b \in \mathbb{F}_2^t$ is such that $0 < w_H(b) \le l$, then for every $y \in \mathbb{F}_2^t$ at least k + 1
coordinates of the words $\phi(y \oplus b)$ and $\phi(y)$ differ,

then f satisfies the propagation criterion of degree l and order k.

We recall that Kurosawa and Satoh set the function $\phi$ to be linear. In order
to construct a linear mapping $\phi$, they proposed to use generator matrices of linear
codes. A binary linear [n, k, d] code C is a k-dimensional vector subspace of $\mathbb{F}_2^n$.
Its minimal distance d equals the smallest positive Hamming weight of the words
of C. The dual code $C^\perp$ is a linear $[n, n - k, d^\perp]$ code defined by
$C^\perp = \{u \in \mathbb{F}_2^n \mid u \cdot v = 0,\ \forall v \in C\}$. The dual distance of C is the minimal distance
of $C^\perp$. Usually, a linear code C is represented by a $k \times n$ generator matrix G, whose
rows form a basis of the vector space. Then, for all $x \in \mathbb{F}_2^k$, xG (usual matrix
product) is a codeword. Furthermore, G is a parity check matrix of the code $C^\perp$;
that is, y is a codeword of $C^\perp$ if and only if $G y^T = 0$. Likewise, if H is an
$(n - k) \times n$ generator matrix of $C^\perp$, then for all $y \in \mathbb{F}_2^{n-k}$, yH is a codeword of $C^\perp$,
and x is a codeword of C if and only if $H x^T = 0$.
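
The minimal distance and the dual distance can be computed by enumeration for short codes. The sketch below does this for one systematic generator matrix of the [7, 4, 3] Hamming code; the particular matrix and the helper names are illustrative assumptions.

```python
from itertools import product

def codewords(G):
    """All words xG for x in F_2^k, where G is a k x n generator matrix."""
    k, n = len(G), len(G[0])
    return {tuple(sum(m[i] * G[i][j] for i in range(k)) % 2 for j in range(n))
            for m in product((0, 1), repeat=k)}

def min_distance(code):
    """Smallest positive Hamming weight of a codeword."""
    return min(sum(c) for c in code if any(c))

def dual(G):
    """Words u with G u^T = 0, i.e. the dual code of the code generated by G."""
    n = len(G[0])
    return {u for u in product((0, 1), repeat=n)
            if all(sum(g * ui for g, ui in zip(row, u)) % 2 == 0 for row in G)}

# One systematic generator matrix [I_4 | P] of the [7, 4, 3] Hamming code.
G_HAMMING = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

C = codewords(G_HAMMING)
C_dual = dual(G_HAMMING)
```

Here the dual is the [7, 3, 4] simplex code, so the dual distance of this Hamming code is 4.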

Proposition 4. [12] Let $G_1$ (resp. $G_2$) be the generator matrix of a linear $[n_1, k_1, d_1]$
(resp. $[n_2, k_2, d_2]$) code $C_1$ (resp. $C_2$) of dual distance $d_1^\perp$ (resp. $d_2^\perp$). Then the
function $f(x, y) = x \cdot \phi(y) \oplus g(y)$, where $x \in \mathbb{F}_2^n$, $y \in \mathbb{F}_2^n$, g is any n-variable Boolean
function and $\phi$ is a mapping from $\mathbb{F}_2^n$ into $\mathbb{F}_2^n$ such that $\phi(y) = G_2^T G_1 y^T$, satisfies
the propagation criterion of degree $\min(d_1^\perp, d_2^\perp) - 1$ and order $\min(d_1, d_2) - 1$.

Kurosawa-Satoh's construction can be used to construct (2n)-variable
Boolean functions of algebraic degrees at most n and satisfying PC($d^\perp - 1$) of







order $d - 1$. The maximum algebraic degree can be achieved by choosing a function
g having maximum algebraic degree. Kurosawa and Satoh [12] gave a necessary
condition on the function f defined in Proposition 4 to be balanced. Indeed, f is
balanced if:

$\#\{y \mid g(y) = 0,\ G_2^T G_1 y^T = 0\} = \#\{y \mid g(y) = 1,\ G_2^T G_1 y^T = 0\}$.

In order to maximize the algebraic degree and the degree of propagation (with 
respect to the number of variables), we adapt Kurosawa-Satoh’s construction. 

Proposition 5. Let G be the generator matrix of a linear [n, k] code C of dual
distance $d^\perp$ and $\phi$ the mapping from $\mathbb{F}_2^n$ into $\mathbb{F}_2^k$ defined by $\phi : y \mapsto G y^T$. Then,
the function $f(x, y) = x \cdot \phi(y) \oplus g(y)$, where $x \in \mathbb{F}_2^k$ and $y \in \mathbb{F}_2^n$, has algebraic
degree at most n and satisfies the propagation criterion of degree $d^\perp - 1$.

Proof. We have to check Conditions 1 and 2 of Proposition 3 (with k = 0 and
$l = d^\perp - 1$). In the present case, Condition 1 is equivalent to saying that for every
a in $\mathbb{F}_2^k$ such that $0 < w_H(a) \le d^\perp - 1$, we must have $w_H(aG) \ge 1$. The rows of G
form a basis of C, hence Condition 1 is fulfilled. Condition 2 is equivalent to saying
that for every b in $\mathbb{F}_2^n$ such that $0 < w_H(b) \le d^\perp - 1$, we must have $w_H(G b^T) > 0$.
This condition is clearly fulfilled because b is not a codeword of $C^\perp$ and G is a
parity check matrix of $C^\perp$. □
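
Proposition 5 can be checked exhaustively on a small instance. The sketch below assumes the [3, 2, 2] parity check (even-weight) code, whose dual is the [3, 1, 3] repetition code, so $d^\perp = 3$ and the proposition predicts PC(2) on 5 variables. The choice $g(y) = y_1 y_2 y_3$ and all helper names are illustrative assumptions.

```python
from itertools import product

# Generator matrix of the [3, 2, 2] even-weight code (assumed example).
G = [[1, 1, 0],
     [0, 1, 1]]

def phi(y):
    """phi(y) = G y^T, a linear map from F_2^3 into F_2^2."""
    return tuple(sum(g * yi for g, yi in zip(row, y)) % 2 for row in G)

def f(v):
    """f(x, y) = x . phi(y) XOR g(y) with x in F_2^2, y in F_2^3."""
    x, y = v[:2], v[2:]
    g = y[0] & y[1] & y[2]          # degree-3 choice of g
    return (sum(xi & pi for xi, pi in zip(x, phi(y))) % 2) ^ g

def satisfies_pc(func, n, l):
    """Brute-force PC(l) test: every derivative of weight 1..l is balanced."""
    pts = list(product((0, 1), repeat=n))
    tt = {x: func(x) for x in pts}
    for u in pts:
        if 0 < sum(u) <= l:
            ones = sum(tt[x] ^ tt[tuple(a ^ b for a, b in zip(x, u))] for x in pts)
            if 2 * ones != len(pts):
                return False
    return True
```

For this instance PC(2) holds and PC(3) fails: the direction (a, b) = (0, 111) lies in the dual code, and the corresponding derivative reduces to $D_b g$, which is not balanced for this g.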

Kurosawa and Satoh used several linear codes: the Hamming code
$\mathcal{H}_m = [2^m - 1, 2^m - m - 1, 3]$ and its dual code, the simplex code $\mathcal{H}_m^\perp = [2^m - 1, m, 2^{m-1}]$,
the first-order Reed-Muller code $R(1, m) = [2^m, m + 1, 2^{m-1}]$ and its dual, the
extended Hamming code $R^\perp(1, m) = [2^m, 2^m - m - 1, 4]$. We propose to consider
another linear code, the parity check code [m, m − 1, 2], whose dual code
has parameters [m, 1, m]. We give in Figure 1 the values of the different parameters
(number of variables, degree of propagation and algebraic degree) for the
constructions of Proposition 4 (by taking $G_1 = G_2$) and Proposition 5, according to the
choice of the linear code (the degree of propagation and the algebraic degree do
not change between the two constructions, only the number of variables). Figure 1



Linear code              Nb. of variables   Nb. of variables   Algebraic   Degree of
                         (Proposition 4)    (Proposition 5)    degree      propagation

$\mathcal{H}_m$                    $2^{m+1} - 2$        $2^{m+1} - m - 2$    $2^m - 1$     $2^{m-1} - 1$
$\mathcal{H}_m^\perp$              $2^{m+1} - 2$        $2^m + m - 1$        $2^m - 1$     2
$R(1, m)$                $2^{m+1}$            $2^m + m + 1$        $2^m$         3
$R^\perp(1, m)$          $2^{m+1}$            $2^{m+1} - m - 1$    $2^m$         $2^{m-1} - 1$
Parity check             $2m$               $2m - 1$           $m$         $m - 1$

Figure 1. Applications of Propositions 4 and 5 for linear codes

shows that the functions obtained by Proposition 5 have very high algebraic
degrees. Furthermore, the use of the parity check code leads to the construction of
(2m − 1)-variable Boolean functions (for m > 2) having algebraic degree m and







satisfying PC(m − 1); the degree l of propagation is near $n/2$ instead of at most
$n/4$ for the linear codes previously used.

Carlet [4] generalized Kurosawa-Satoh's construction by using two systematic
nonlinear codes $C_1$ and $C_2$ (the notion of dual distance being still valid for
nonlinear codes), and proposed to use the $(2^m, 2^{2^m - 2m}, 6)$ Preparata code $\mathcal{P}_m$, whose
dual distance is $2^{m-1} - 2^{m/2-1}$, and the $(2^m, 2^{2m}, 2^{m-1} - 2^{m/2-1})$ Kerdock code $\mathcal{K}_m$,
whose dual distance is 6 (m even, $m \ge 4$; we give here the length, the cardinality and
the minimum distance). We complete the table (see Figure 2) by considering four



Nonlinear codes                              Nb. of variables   Degree of                      Order of
                                                                propagation                    propagation

$C_1 = \mathcal{K}_m$, $C_2 = \mathcal{P}_m$      $2^{m+1}$            $2^{m-1} - 2^{m/2-1} - 1$        5
$C_1 = C_2 = \mathcal{K}_m$                     $2^{m+1}$            5                              $2^{m-1} - 2^{m/2-1} - 1$
$(36, 2^{18}, 8)$ [2]                        72                 7                              7
$(48, 2^{24}, 12)$ [2]                       96                 11                             11
$(64, 2^{32}, 14)$ [15]                      128                13                             13
$(96, 2^{48}, 18)$ [15]                      192                17                             17

Figure 2. Values of parameters for Carlet's construction

nonlinear codes obtained by Bonnecaze et al. [2] and Pless and Qian [15] from 
Hensel lifting to Z4 of quadratic residue codes and applying the Gray map. The 
best value for the degree of propagation obtained using those codes is near n/4. 

3.2. A New Construction of Balanced Boolean Functions Satisfying PC(l) 

We now propose another construction which allows us to obtain almost the same
parameter values (algebraic degree and degree of propagation) as those
obtained with the parity check code, for an odd number of variables. Furthermore,
we give a necessary and sufficient condition for the function to be balanced.

Proposition 6. Let n be any positive integer and f a (2n + 1)-variable Boolean
function such that $f(x, y, z) = z(g(x) \oplus y_1 \oplus \cdots \oplus y_n) \oplus x \cdot y$, where $x \in \mathbb{F}_2^n$, $y \in \mathbb{F}_2^n$,
$z \in \mathbb{F}_2$ and g is any n-variable Boolean function. Then f has algebraic degree at
most n and satisfies PC(n). Furthermore, the function f is balanced if and only
if $g(\mathbf{1}) = 1$, where $\mathbf{1}$ denotes the all-one word of $\mathbb{F}_2^n$.

Proof. For all (a, b, c) such that $0 < w_H(a) + w_H(b) + w_H(c) \le n$ and $a \in \mathbb{F}_2^n$,
$b \in \mathbb{F}_2^n$, $c \in \mathbb{F}_2$, we have
$D_{a,b,c} f(x, y, z) = z(g(x) \oplus g(x \oplus a)) \oplus c(g(x \oplus a) \oplus y_1 \oplus \cdots \oplus y_n \oplus b_1 \oplus \cdots \oplus b_n) \oplus z(b_1 \oplus \cdots \oplus b_n) \oplus a \cdot y \oplus b \cdot x \oplus a \cdot b$.
If $0 < w_H(a) < n$, then $a \cdot y \oplus c(y_1 \oplus \cdots \oplus y_n)$ is not equal to the null function
and $D_{a,b,c} f$ linearly depends on at least one variable $y_i$. If $w_H(a) = n$, then
$w_H(b) = w_H(c) = 0$ and the derivative is balanced. If $w_H(a) = 0$ and $w_H(c) \ne 0$, then the
derivative linearly depends on $y_1, y_2, \dots, y_n$ and it is balanced. Finally,
$D_{0,b,0} f(x, y, z) = z(b_1 \oplus \cdots \oplus b_n) \oplus b \cdot x$ is balanced.








The function f can be decomposed as follows: $f(x, y, z) = (1 \oplus z) f_1(x, y) \oplus z f_2(x, y)$,
where $f_1(x, y) = x \cdot y$ and $f_2(x, y) = g(x) \oplus (x \oplus \mathbf{1}) \cdot y$. Then, we have
$w_H(f) = w_H(f_1) + w_H(f_2)$. The Hamming weight of $f_1$ is equal to $2^{2n-1} - 2^{n-1}$ (see [14]).
Thus, the function f is balanced if and only if the Hamming weight of $f_2$ equals
$2^{2n-1} + 2^{n-1}$. We have
$w_H(f_2) = 2^n w_H(g) + (2^{2n-1} - 2^{n-1}) - 2\,\#\{(x, y) \in \mathbb{F}_2^n \times \mathbb{F}_2^n \mid g(x) = 1,\ (x \oplus \mathbf{1}) \cdot y = 1\}$.
For every $x \in \mathbb{F}_2^n$ such that $g(x) = 1$ and $x \ne \mathbf{1}$, we have
$\#\{y \in \mathbb{F}_2^n \mid (x \oplus \mathbf{1}) \cdot y = 1\} = \#\{y \in \mathbb{F}_2^n \mid (x \oplus \mathbf{1}) \cdot y = 0\} = 2^{n-1}$. So,
$w_H(f_2) = 2^n w_H(g) + (2^{2n-1} - 2^{n-1}) - 2^n\,\#\{x \in \mathbb{F}_2^n \mid g(x) = 1,\ x \ne \mathbf{1}\}$. Thus,
$w_H(f_2)$ equals $2^{2n-1} + 2^{n-1}$ if and only if $g(x) = 1$ where $x = \mathbf{1}$. □

Since there is no strong condition on the function g (only $g(x) = 1$ for $x = \mathbf{1}$),
this construction provides balanced Boolean functions in (2n + 1) variables, having
algebraic degree equal to n and satisfying PC(n).
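
The statement of Proposition 6 can be checked by brute force for n = 2 (5 variables). In the sketch below, the two choices of g and the helper names are illustrative assumptions: one g takes the value 1 on the all-one word and the other does not.

```python
from itertools import product

N = 2  # f has 2N + 1 = 5 variables, ordered (x1, x2, y1, y2, z)

def build(g):
    """f(x, y, z) = z (g(x) XOR y1 XOR ... XOR yn) XOR x . y."""
    def f(v):
        x, y, z = v[:N], v[N:2 * N], v[2 * N]
        inner = g(x) ^ (sum(y) % 2)
        dot = sum(xi & yi for xi, yi in zip(x, y)) % 2
        return (z & inner) ^ dot
    return f

def is_balanced(f, n):
    return 2 * sum(f(x) for x in product((0, 1), repeat=n)) == 2 ** n

def satisfies_pc(f, n, l):
    """Brute-force PC(l) test: every derivative of weight 1..l is balanced."""
    pts = list(product((0, 1), repeat=n))
    tt = {x: f(x) for x in pts}
    for u in pts:
        if 0 < sum(u) <= l:
            ones = sum(tt[x] ^ tt[tuple(a ^ b for a, b in zip(x, u))] for x in pts)
            if 2 * ones != len(pts):
                return False
    return True

f_one = build(lambda x: x[0] & x[1])   # g(1, 1) = 1
f_zero = build(lambda x: x[0] ^ x[1])  # g(1, 1) = 0
```

Both functions satisfy PC(2), but only the first one is balanced, in line with the condition on g at the all-one word.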

3.3. Honda et al.'s Construction and Improvements

Honda et al. [11] studied a class of functions also related to linear codes but which
are not Maiorana-MacFarland's functions. They set the linear code to be the binary
simplex code and thus obtained a construction of n-variable Boolean functions satisfying
PC(2) and having algebraic degree near $n - \log_2 n$.

Proposition 7. [11] Let m be a positive integer and G the generator matrix of the
$[2^m - 1, m, 2^{m-1}]$ simplex code. We assume that the ith column of G is the binary
representation of the integer i. Let f be a $(2^m + m - 1)$-variable Boolean function
such that $f(x, y, z) = f_1(x) \oplus f_2(y) \oplus f_3(z) \oplus x G [y_1, \dots, y_{2^m - 2}, z]^T$, where $x \in \mathbb{F}_2^m$,
$y \in \mathbb{F}_2^{2^m - 2}$, $z \in \mathbb{F}_2$, and $f_1$, $f_2$ and $f_3$ are any Boolean functions. Then f has
algebraic degree at most $2^m - 2$ and satisfies PC(2).

This proposition shows the existence of n-variable Boolean functions having
algebraic degrees near $n - \log_2 n$ and satisfying PC(2). We can generalize this
construction by replacing the mapping $(y, z) \mapsto G [y_1, \dots, y_{2^m - 2}, z]^T$ by the mapping
$(y, z) \mapsto \phi(y, z)$; note that this mapping is not necessarily linear.

Proposition 8. Let f be an (s + t)-variable Boolean function defined as follows:

$f(x, y, z) = f_1(x) \oplus f_2(y) \oplus f_3(z) \oplus x \cdot \phi(y, z)$,

where $x \in \mathbb{F}_2^s$, $y \in \mathbb{F}_2^{t-1}$, $z \in \mathbb{F}_2$, $f_1$, $f_2$ and $f_3$ are Boolean functions and $\phi$ is a
mapping from $\mathbb{F}_2^t$ into $\mathbb{F}_2^s$. If the mapping $\phi$ satisfies the following conditions, then
f satisfies PC(2).

1. Every component of $\phi$ linearly depends on z, i.e., $\phi_i(y, z) = h_i(y) \oplus z$ where
$h_i$ is a (t − 1)-variable Boolean function,

2. $\phi_i(y, z)$ and $\phi_i(y, z) \oplus \phi_j(y, z)$ are balanced for $i \ne j$ where i and j are in
$\{1, \dots, s\}$,

3. $\phi(y, z) \ne \phi(y \oplus b, z \oplus c)$ for every b in $\mathbb{F}_2^{t-1}$ and c in $\mathbb{F}_2$ such that
$0 < w_H(b) + w_H(c) \le 2$.

Proof. The function f satisfies PC(2) if the function
$(x, y, z) \mapsto D_a f_1(x) \oplus D_b f_2(y) \oplus D_c f_3(z) \oplus x \cdot (\phi(y, z) \oplus \phi(y \oplus b, z \oplus c)) \oplus a \cdot \phi(y \oplus b, z \oplus c)$
is balanced for all (a, b, c)







such that $0 < w_H(a) + w_H(b) + w_H(c) \le 2$. If $w_H(a) = 0$, then the function is
balanced thanks to Condition 3. Otherwise, if $w_H(b) = 0$, then
$D_{a,0,c} f(x, y, z) = D_a f_1(x) \oplus D_c f_3(z) \oplus x \cdot (\phi(y, z) \oplus \phi(y, z \oplus c)) \oplus a \cdot \phi(y, z \oplus c)$.
Thanks to Condition 1, $\phi(y, z) \oplus \phi(y, z \oplus c)$ is constant, and the function is balanced
thanks to Condition 2 (and $D_c f_3(z)$ is constant). Finally, if $w_H(a) = w_H(b) = 1$, then
$w_H(c) = 0$ and
$D_{a,b,0} f(x, y, z) = D_a f_1(x) \oplus D_b f_2(y) \oplus x \cdot (\phi(y, z) \oplus \phi(y \oplus b, z)) \oplus a \cdot \phi(y \oplus b, z)$.
Since $x \cdot (\phi(y, z) \oplus \phi(y \oplus b, z))$ does not depend on z and $a \cdot \phi(y \oplus b, z)$ linearly
depends on z (thanks to Condition 1), the function is balanced. □
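
The simplex-code instance of Proposition 7 (a linear special case of Proposition 8) can be verified exhaustively for small m. The sketch below is only an illustration; the choices of $f_1$, $f_2$, $f_3$ and the names `honda_function` and `satisfies_pc` are assumptions made for the example.

```python
from itertools import product

def honda_function(m):
    """Return (f, number of variables) for the simplex-code construction, m >= 2.

    Column i of G (1 <= i <= 2^m - 1) is the binary representation of i, so the
    last column is all-one and every component of G [y, z]^T contains z.
    """
    cols = 2 ** m - 1
    G = [[(i >> r) & 1 for i in range(1, cols + 1)] for r in range(m)]

    def f(v):
        x, y, z = v[:m], v[m:m + cols - 1], v[-1]
        w = y + (z,)
        lin = 0
        for r in range(m):
            lin ^= x[r] & (sum(G[r][j] & w[j] for j in range(cols)) % 2)
        f1 = x[0] & x[1]      # illustrative choice of f1
        f2 = int(all(y))      # product of all y_i, degree 2^m - 2
        f3 = z                # illustrative choice of f3
        return f1 ^ f2 ^ f3 ^ lin

    return f, m + cols

def satisfies_pc(f, n, l):
    """Brute-force PC(l) test: every derivative of weight 1..l is balanced."""
    pts = list(product((0, 1), repeat=n))
    tt = {x: f(x) for x in pts}
    for u in pts:
        if 0 < sum(u) <= l:
            ones = sum(tt[x] ^ tt[tuple(a ^ b for a, b in zip(x, u))] for x in pts)
            if 2 * ones != len(pts):
                return False
    return True

f2, n2 = honda_function(2)   # 5 variables
f3, n3 = honda_function(3)   # 10 variables
```

Taking $f_2$ as the product of all $y_i$ reaches the maximal algebraic degree $2^m - 2$ allowed by the proposition.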

The functions constructed by the previous proposition do not necessarily satisfy
PC(3). We propose an adapted construction for Boolean functions satisfying
PC(3).

Proposition 9. Let f be a Boolean function defined as follows:

$f(x, y, z) = f_1(x) \oplus f_2(y) \oplus x \cdot \phi(y) \oplus z_1(x_1 \oplus \cdots \oplus x_s) \oplus z_2(y_1 \oplus \cdots \oplus y_t)$,

where $x \in \mathbb{F}_2^s$, $y \in \mathbb{F}_2^t$, $z \in \mathbb{F}_2^2$, $f_1$, $f_2$ are Boolean functions and $\phi$ is a mapping
from $\mathbb{F}_2^t$ into $\mathbb{F}_2^s$. If the mapping $\phi$ satisfies the following conditions, the function
f satisfies PC(3):

1. for every i and j such that $1 \le i < j \le s$, the functions $y \mapsto \phi_i(y) \oplus \phi_j(y)$
and $y \mapsto \phi_i(y) \oplus \phi_j(y) \oplus y_1 \oplus \cdots \oplus y_t$ are balanced.

2. If $b \in \mathbb{F}_2^t$ is such that $w_H(b) = 2$, then for every $y \in \mathbb{F}_2^t$, at least one and at
most s − 1 coordinates of the words $\phi(y \oplus b)$ and $\phi(y)$ differ.

Proof. If $w_H(a)$ or $w_H(b)$ is odd, then $D_{a,b,c} f$ linearly depends on $z_1$ and/or $z_2$
and it is balanced. In the following, assume that $w_H(a)$ and $w_H(b)$ are even. If
$w_H(a) = w_H(b) = 0$, then $D_{0,0,c} f(x, y, z) = c_1(x_1 \oplus \cdots \oplus x_s) \oplus c_2(y_1 \oplus \cdots \oplus y_t)$ is an
affine nonconstant function since $w_H(c)$ is positive. If $w_H(a) = 2$, then $w_H(b) = 0$
and $D_{a,0,c} f(x, y, z) = D_a f_1(x) \oplus a \cdot \phi(y) \oplus c_1(x_1 \oplus \cdots \oplus x_s) \oplus c_2(y_1 \oplus \cdots \oplus y_t)$,
and the function is balanced thanks to Condition 1. Indeed, either $c_2 = 0$ and
then we know that $\phi_i(y) \oplus \phi_j(y)$ is balanced for $i \ne j$, or $c_2 = 1$ and we know
that $\phi_i(y) \oplus \phi_j(y) \oplus y_1 \oplus \cdots \oplus y_t$ is balanced. If $w_H(b) = 2$, then $w_H(a) = 0$ and
$D_{0,b,c} f(x, y, z) = D_b f_2(y) \oplus x \cdot (\phi(y) \oplus \phi(y \oplus b)) \oplus c_1(x_1 \oplus \cdots \oplus x_s) \oplus c_2(y_1 \oplus \cdots \oplus y_t)$,
and either $c_1 = 0$ or $c_1 = 1$. Condition 2 ensures that $w_H(\phi(y) \oplus \phi(y \oplus b)) \notin \{0, s\}$,
and so the function linearly depends on at least one variable $x_i$. □

3.4. Odd-Propagation Criterion 

Zheng and Zhang [22] proposed several constructions of Boolean functions
satisfying the propagation criterion on almost all vectors of $\mathbb{F}_2^n$, i.e., $f(x) \oplus f(x \oplus u)$ is
balanced for all words u in $\mathbb{F}_2^n \setminus A$ where A is a subset of $\mathbb{F}_2^n$. Next, Bernasconi
introduced the notion of odd-PC, that is, the property for a Boolean function to satisfy
the propagation criterion for every word u of odd Hamming weight. The main
motivation for introducing the class odd-PC is that bent functions both achieve
the highest nonlinearity $2^{n-1} - 2^{n/2-1}$ and satisfy the propagation criterion with
respect to all nonzero vectors (these two conditions are equivalent to each other).
But bent functions are not balanced and have algebraic degree at most $n/2$. So







they cannot be used for cryptographic applications. The class of odd-PC functions
includes bent functions and balanced functions with high algebraic degree.

Definition 3. A Boolean function f belongs to the class odd-PC if and only if it
satisfies the propagation criterion with respect to any word $u \in \mathbb{F}_2^n$ of odd Hamming
weight, i.e., for every word u such that $w_H(u) \equiv 1 \pmod 2$, the function
$x \mapsto f(x) \oplus f(x \oplus u)$ is balanced.

Zheng and Zhang constructed n-variable Boolean functions satisfying the propagation
criterion for every word u in $\mathbb{F}_2^n$ except for u in {(0 ... 0), (10 ... 0)}: the functions
$f = x_1 \oplus g(x_2, \dots, x_n)$ where g is a bent function. Using this construction, Bernasconi
proposed a way to obtain odd-PC functions having the best achievable algebraic
degree, i.e., n − 1. Indeed, since an odd-PC function is PC(1), the bound on the
algebraic degree of PC(1) functions is also valid for odd-PC functions.

Proposition 10. [1] For any $n \ge 3$, there exists an explicit balanced and odd-PC
Boolean function whose algebraic degree d is equal to n − 1.

In order to prove this proposition, Bernasconi constructed one such function
by induction from a 3-variable function (see [1] for more details). We give a general
direct construction of odd-PC functions having algebraic degree at most n − 2.

Proposition 11. Let $x \in \mathbb{F}_2^{n-2}$ and $z \in \mathbb{F}_2^2$. If f is an n-variable function such that
$f(x, z) = g(x) \oplus z_1(x_1 \oplus \cdots \oplus x_{n-2} \oplus z_2)$, where g is any (n − 2)-variable Boolean
function, then f satisfies odd-PC. Furthermore, f is balanced if and only if g is
balanced.

Proof. We have to show that the function $D_{a,c} f$ is balanced for all (a, c) such that
$a \in \mathbb{F}_2^{n-2}$, $c \in \mathbb{F}_2^2$ and $w_H(a) + w_H(c) \equiv 1 \pmod 2$. If $c_1 = 0$, then $w_H(a) + c_2$ is
odd and the function $D_{a,c} f$ linearly depends on $z_1$ and is balanced. If $c_1 = 1$, then
the derivative linearly depends on $z_2$.

For every x in $\mathbb{F}_2^{n-2}$ such that $g(x) = 1$, we have
$\#\{z \in \mathbb{F}_2^2 \mid z_1(x_1 \oplus \cdots \oplus x_{n-2} \oplus z_2) = 0\} = 3$, and for every x in $\mathbb{F}_2^{n-2}$ such that
$g(x) = 0$, we have $\#\{z \in \mathbb{F}_2^2 \mid z_1(x_1 \oplus \cdots \oplus x_{n-2} \oplus z_2) = 1\} = 1$. So
$w_H(f) = 3 w_H(g) + (2^{n-2} - w_H(g)) = 2^{n-2} + 2 w_H(g)$, and f is balanced if and only if g is
balanced. □
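
Proposition 11 can be illustrated on n = 5 variables by brute force. The two choices of g below (one balanced, one not) and the helper names are illustrative assumptions.

```python
from itertools import product

N = 5  # variables ordered (x1, x2, x3, z1, z2)

def build(g):
    """f(x, z) = g(x) XOR z1 (x1 XOR ... XOR x_{n-2} XOR z2)."""
    def f(v):
        x, z1, z2 = v[:N - 2], v[N - 2], v[N - 1]
        return g(x) ^ (z1 & ((sum(x) % 2) ^ z2))
    return f

def is_balanced(f, n):
    return 2 * sum(f(x) for x in product((0, 1), repeat=n)) == 2 ** n

def satisfies_odd_pc(f, n):
    """D_u f is balanced for every word u of odd Hamming weight."""
    pts = list(product((0, 1), repeat=n))
    tt = {x: f(x) for x in pts}
    for u in pts:
        if sum(u) % 2 == 1:
            ones = sum(tt[x] ^ tt[tuple(a ^ b for a, b in zip(x, u))] for x in pts)
            if 2 * ones != len(pts):
                return False
    return True

f_bal = build(lambda x: x[0] ^ (x[1] & x[2]))   # g balanced
f_unb = build(lambda x: x[0] & x[1] & x[2])     # g of weight 1, not balanced
```

Both functions are odd-PC whatever g is, while balancedness of f follows that of g.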



4. Propagation Criterion for Symmetric Boolean Functions 

To make the computation of the ciphertext from the plaintext more efficient,
Daemen et al. [7] proposed to use symmetric functions. This obviously presents the risk
of allowing attacks that exploit the specific structure of these functions. For instance,
Savicky proved in [19] that all symmetric bent functions are quadratic, i.e., have algebraic
degree 2 (so they are not suitable for cryptographic use). He needed a whole paper
to give such a proof. We prove in a shorter way that, more generally, nonquadratic
symmetric functions cannot satisfy PC(2). Furthermore, we give constructions of
nonquadratic symmetric Boolean functions satisfying PC(1).







Definition 4. An n-variable Boolean function f is called a symmetric function if it
is invariant under permutation of the variables, i.e., $\forall \pi \in S_n$,
$f(x_{\pi(1)}, \dots, x_{\pi(n)}) = f(x_1, \dots, x_n)$.

The algebraic normal form of a symmetric function is of the form:

$f(x) = \bigoplus_{i=0}^{n} a_i \left( \bigoplus_{u \in \mathbb{F}_2^n,\ w_H(u) = i} x^u \right)$,

where $a_i$ is in $\mathbb{F}_2$. Also, there exists a function $f^\# : \{0, \dots, n\} \to \mathbb{F}_2$ such that
$f(x) = f^\#(w_H(x))$ for every $x \in \mathbb{F}_2^n$.

4.1. Construction of Symmetric PC(1) Boolean Functions 

In this paper, we are interested in the propagation criterion for symmetric
functions and more precisely in the construction of such functions. We first notice
a link between the constructions of balanced symmetric functions and of PC(1)
symmetric functions. Every symmetric n-variable Boolean function f can be
written $f = x_1 f_1 \oplus f_2$ where $f_1$ and $f_2$ are two symmetric (n − 1)-variable Boolean
functions. Since f is a symmetric function, it satisfies the propagation criterion of
degree 1 if and only if $D_{e_1} f$ is balanced, i.e., if $f_1$ is balanced. Thus, the
construction of PC(1) symmetric functions in n variables is equivalent to the construction
of balanced symmetric (n − 1)-variable functions. Indeed, the knowledge of $f_1$
uniquely determines $f_2$, up to constants. More precisely, if $(a_0, \dots, a_{n-1})$ are the
coefficients of the elementary symmetric functions $\bigoplus_{u \mid w_H(u) = i} x^u$ in the ANF of
$f_1$, then $f_2$ can be computed as follows:

$f_2(x) = \bigoplus_{i=0}^{n-2} a_i \left( \bigoplus_{u \in \mathbb{F}_2^{n-1},\ w_H(u) = i+1} x^u \right) \oplus \mathrm{cst}$.

Furthermore, the construction of balanced functions is equivalent to the
construction of Boolean functions having numerical degree at most n − 1. Indeed, Carlet and
Guillot [6] showed that a function f(x) is balanced if and only if $f(x) \oplus x_1 \oplus \cdots \oplus x_n$
has numerical degree at most n − 1. Von zur Gathen and Roche [10] proposed
several constructions of symmetric Boolean functions having numerical degree at most
n − 1 while they were working on the degree of polynomials in $\mathbb{R}[x]$ that take only
two values on the domain $\{0, \dots, n\}$.
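
The equivalence recalled above can be confirmed for every function on a small number of variables. The sketch below relies on the NNF coefficient formula of Section 2.1: the numerical degree is at most n − 1 exactly when the coefficient of $x_1 \cdots x_n$ vanishes. The function names are illustrative assumptions.

```python
from itertools import product

def top_nnf_coefficient(h, n):
    """NNF coefficient of x1...xn: (-1)^n * sum over v of (-1)^wH(v) h(v)."""
    return (-1) ** n * sum((-1) ** sum(v) * h(v)
                           for v in product((0, 1), repeat=n))

def equivalence_holds(n):
    """Check 'f balanced <=> f XOR x1 XOR ... XOR xn has numerical degree <= n-1'
    for all 2^(2^n) Boolean functions in n variables."""
    pts = list(product((0, 1), repeat=n))
    for table in product((0, 1), repeat=len(pts)):
        f = dict(zip(pts, table))
        balanced = 2 * sum(table) == len(pts)
        shifted = lambda v: f[v] ^ (sum(v) % 2)
        low_degree = top_nnf_coefficient(shifted, n) == 0
        if balanced != low_degree:
            return False
    return True
```

Only small n are practical here, since the number of functions grows doubly exponentially.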

For every positive integer n, the symmetric affine functions are balanced. For 
an odd number of variables, we first recall the trivial construction of balanced 
symmetric functions. 

Proposition 12. Let n be an odd positive integer, and f an n-variable symmetric
Boolean function. If $f(u) = f(u \oplus \mathbf{1}) \oplus 1$ for every word u in $\mathbb{F}_2^n$, then the function
f is balanced.







Proof. First recall that $\binom{n}{i} = \binom{n}{n-i}$ for every $i = 0, \dots, n$. Then, the support
of such a symmetric function f contains, for all i in $\{0, \dots, n\}$, either the words of
Hamming weight i or the words of Hamming weight n − i (but not both). We need
$i \ne n - i$ for every i, that is, n odd. The function is then balanced. □

It can be checked that, for any odd integer n lower than or equal to 25 and
different from 13, the symmetric functions constructed by Proposition 12 are the
only balanced symmetric ones. For n = 13, von zur Gathen and Roche obtained a
nontrivial construction of n-variable symmetric balanced functions having algebraic
degree equal to n − 1.
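
A symmetric function is determined by its values on the n + 1 possible Hamming weights, so the exhaustive search mentioned above only has to scan $2^{n+1}$ value vectors. The following sketch (with hypothetical helper names) does this for a few small odd n and tests whether each balanced function is of the trivial type of Proposition 12.

```python
from itertools import product
from math import comb

def balanced_symmetric(n):
    """Yield value vectors (v_0, ..., v_n), v_w being the value on words of
    weight w, of all balanced symmetric n-variable functions."""
    target = 2 ** (n - 1)
    binom = [comb(n, w) for w in range(n + 1)]
    for v in product((0, 1), repeat=n + 1):
        if sum(b for b, bit in zip(binom, v) if bit) == target:
            yield v

def is_trivial(v):
    """Trivial construction: f(u) = f(u XOR 1) XOR 1, i.e. v_w != v_{n-w}."""
    n = len(v) - 1
    return all(v[w] != v[n - w] for w in range(n + 1))

# For n = 13 at least one balanced symmetric function is not of the trivial type.
nontrivial_13 = [v for v in balanced_symmetric(13) if not is_trivial(v)]
```

The scan is cheap for these sizes (at most $2^{14}$ vectors for n = 13); larger n would call for a smarter subset-sum search.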

Proposition 13. [10] Let n be a positive odd integer, k an integer such that
$2 \le k \le (n - 3)/2$. Let f be an n-variable Boolean function whose support is such that
$\mathrm{supp}(f) = \{u \in \mathbb{F}_2^n \mid w_H(u) \in \{k - 2, k - 1, n - k - 1, n - k\}\}$. The function f has
numerical degree less than n if and only if $n = 4t^2 - 3$ and $k = 2t^2 - t - 1$, where
$t \ge 2$.

A Boolean function has numerical degree less than n if and only if its support
contains the same number of words of odd and even Hamming weights. The proof
consists in solving the equation
$\binom{n}{k-2} + \binom{n}{n-k-1} = \binom{n}{k-1} + \binom{n}{n-k}$ for the possible
values of k and n when n is odd. From the previous proposition and Proposition
12, we can deduce a construction of balanced symmetric functions. Indeed, instead
of taking a balanced Boolean function f whose support is such that, for every i in
$\{0, \dots, n\}$, the words whose Hamming weights are either i or n − i (but not both)
are in the support of f, we search for the functions f whose support contains the words
of Hamming weight either k − 2, n − k + 2, k + 1, n − k − 1 or k − 1, n − k + 1, k, n − k,
and either i or n − i for the other values.

Corollary 1. Let n be a positive odd integer, k and t two integers and b an element
of $\mathbb{F}_2$. The following function f is balanced if and only if $n = 4t^2 - 3$ and
$k = 2t^2 - t - 1$.

$f(x) = \begin{cases} b & \text{if } w_H(x) \in \{k - 2, k + 1, n - k - 1, n - k + 2\} \\ b \oplus 1 & \text{if } w_H(x) \in \{k - 1, k, n - k, n - k + 1\} \\ f(x \oplus \mathbf{1}) \oplus 1 & \text{otherwise} \end{cases}$

Proof. Let $f'$ be a symmetric Boolean function such that $f'(x) = f(x) \oplus 1$ for
all x of Hamming weight in $\{v_1, v_2, v_3, v_4\}$, where $v_1 = k - 2$ or $v_1 = n - k + 2$,
and $v_2 = k + 1$ or $v_2 = n - k - 1$, $v_3 = k - 1$ or $v_3 = n - k + 1$, and $v_4 = k$ or
$v_4 = n - k$, and $f(x) = f'(x)$ for other values. We choose the values $v_1$, $v_2$, $v_3$
and $v_4$ in order to get the property $f'(x) \oplus 1 = f'(x \oplus \mathbf{1})$ for all $x \in \mathbb{F}_2^n$. Then, we
have $w_H(f') = 2^{n-1}$ and
$w_H(f) = w_H(f') \pm \left[ \binom{n}{k} + \binom{n}{k-1} - \binom{n}{k+1} - \binom{n}{k-2} \right]$. The
function f is balanced if and only if $\binom{n}{k} + \binom{n}{k-1} = \binom{n}{k+1} + \binom{n}{k-2}$. The solutions of
this equation are given by Proposition 13. □
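
The binomial identity behind Proposition 13 and the balancedness claimed in Corollary 1 can both be checked numerically. In the sketch below the helper names are assumptions, and the "otherwise" branch of the corollary is filled with one arbitrary admissible choice (value 0 on the lower weight of each remaining pair).

```python
from math import comb

def parameters(t):
    """(n, k) = (4t^2 - 3, 2t^2 - t - 1)."""
    return 4 * t * t - 3, 2 * t * t - t - 1

def equation_holds(n, k):
    """C(n,k-2) + C(n,k+1) == C(n,k-1) + C(n,k), using C(n,n-j) = C(n,j)."""
    return comb(n, k - 2) + comb(n, k + 1) == comb(n, k - 1) + comb(n, k)

def corollary_values(n, k, b):
    """Value vector (indexed by Hamming weight) of a function as in Corollary 1."""
    v = [None] * (n + 1)
    for w in (k - 2, k + 1, n - k - 1, n - k + 2):
        v[w] = b
    for w in (k - 1, k, n - k, n - k + 1):
        v[w] = b ^ 1
    for w in range(n + 1):
        if v[w] is None:              # 'otherwise' branch: v_w != v_{n-w}
            v[w] = 0 if w < n - w else 1
    return v

def is_balanced(v):
    n = len(v) - 1
    return sum(comb(n, w) for w in range(n + 1) if v[w]) == 2 ** (n - 1)
```

For t = 2, 3, 4, 5 this gives n = 13, 33, 61, 97, matching the list of admissible numbers of variables quoted in the text.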







All the 13-variable balanced symmetric functions can be constructed from
Proposition 12 and Corollary 1. Furthermore, Corollary 1 gives a way to
construct balanced symmetric Boolean functions for a number of variables n equal to
13, 33, 61, 97, ..., and then a way to construct PC(1) symmetric functions for a
number of variables n = 14, 34, 62, 98, .... For an even number of variables n, von
zur Gathen and Roche's constructions (presented in [10]) of Boolean functions
having numerical degree at most n − 1 provide balanced symmetric functions thanks
to a result of Carlet and Guillot previously recalled. By using exhaustive search of
balanced symmetric functions for a low number of variables, we observe that all
the functions are described by the different constructions which can be found in
[10]. The first balanced symmetric functions which are not characterized by von
zur Gathen and Roche exist for n = 24. We give the truth tables of all balanced
symmetric functions for n = 24 (except affine functions) in Figure 3, where $b_i$ is in
$\mathbb{F}_2$ and $\bar{b}_i = b_i \oplus 1$. The Walsh spectrum of an n-variable symmetric function f is



$w_H(x)$    0   1   2   3       4   5       6   7       8   9       10  11      12
$f_1(x)$    0   1   0   $b_1$   1   1       0   $b_2$   0   $b_3$   1   0       1
$f_2(x)$    0   1   1   1       0   $b_1$   1   $b_2$   0   $b_3$   1   $b_4$   0

$w_H(x)$    13            14  15            16  17            18  19            20  21            22  23  24
$f_1(x)$    0             1   $\bar{b}_3$   0   $\bar{b}_2$   0   1             1   $\bar{b}_1$   0   1   0
$f_2(x)$    $\bar{b}_4$   1   $\bar{b}_3$   0   $\bar{b}_2$   1   $\bar{b}_1$   0   1             1   1   0

Figure 3. Balanced symmetric functions for n = 24 (ANF 1 and
ANF 2)



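defined by the list of the following n + 1 values:

$\left( \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x)},\ \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus x_1},\ \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus x_1 \oplus x_2},\ \dots,\ \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus x_1 \oplus \cdots \oplus x_n} \right)$.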

The Walsh spectrum is often studied because several cryptographic criteria
(e.g., resiliency) can be characterized by it. The Walsh spectrum of the functions
having ANF 1 and ANF 2 has the particularity that it contains only one zero. We
give the Walsh spectrum of the function $f_1$ with $b_1 = b_2 = 1$ and $b_3 = 0$:

(0, 362296, 123568, -62552, -12448, 8152, -3184, 72, 3264, -1672, -1936, 2024,
1056, -2024, -688, 1672, 384, -72, 3632, -8152, -44832, 62552, 538384, -362296,
-9814464).
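
The spectrum of a symmetric function can be recomputed from its simplified value vector alone. The following Python sketch is an illustration, not the author's procedure. It assumes the value vector of $f_1$ as read from Figure 3, with the entries at weights 15, 17 and 21 taken to be the complements $\bar{b}_3$, $\bar{b}_2$, $\bar{b}_1$, and it uses the standard fact that the sum of $(-1)^{x_1 \oplus \cdots \oplus x_j}$ over all $x$ of weight $k$ is a Krawtchouk polynomial value. All function names are hypothetical.

```python
from math import comb


def krawtchouk(n, j, k):
    """Sum of (-1)^(x_1 + ... + x_j) over all x in F_2^n of Hamming weight k."""
    return sum((-1) ** i * comb(j, i) * comb(n - j, k - i)
               for i in range(min(j, k) + 1))


def symmetric_walsh_spectrum(v):
    """Values sum_x (-1)^(f(x) + x_1 + ... + x_j) for j = 0..n, where
    f(x) = v[w_H(x)] is the symmetric function with value vector v."""
    n = len(v) - 1
    return [sum((-1) ** v[k] * krawtchouk(n, j, k) for k in range(n + 1))
            for j in range(n + 1)]


def f1_value_vector(b1, b2, b3):
    """Value vector of f_1 for n = 24 (assumed reading of Figure 3)."""
    c1, c2, c3 = b1 ^ 1, b2 ^ 1, b3 ^ 1  # complemented parameters
    return [0, 1, 0, b1, 1, 1, 0, b2, 0, b3, 1, 0, 1,
            0, 1, c3, 0, c2, 0, 1, 1, c1, 0, 1, 0]


spectrum = symmetric_walsh_spectrum(f1_value_vector(1, 1, 0))
print(spectrum[0], spectrum[1], spectrum[24])
```

The first entry is the balancedness test (it vanishes exactly when $f$ is balanced), so the same routine can be used to check every choice of the free parameters $b_i$.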





4.2. Propagation Criterion of Degree Greater Than 1 

We are finally interested in the construction of symmetric Boolean functions
satisfying PC($\ell$) where $\ell \ge 2$. Savicky [19] proved that the symmetric bent functions
(i.e., satisfying PC($n$)) are quadratic. Preneel et al. [16] proved that quadratic
symmetric functions satisfying PC($\ell$) of order $k$ exist if $k + \ell \le n - 1$, or if $k + \ell = n$
and $k$ is even. Furthermore, they proved that quadratic functions are the only (not
necessarily symmetric) functions satisfying PC(2) of order $n - 2$. Carlet [4] showed
that the Boolean functions $f$ which satisfy PC($\ell$) of order $n - \ell$ are the four
symmetric quadratic Boolean functions. We prove here that the symmetric quadratic
functions are the only symmetric functions satisfying PC($\ell$) when $\ell \ge 2$.

Theorem 1. The only symmetric Boolean functions which satisfy the propagation
criterion of degree $\ell$ where $2 \le \ell < n$ are the quadratic symmetric functions.
Furthermore, if $n$ is even, the quadratic functions also satisfy PC($n$).

Proof. Any symmetric function $f$ can be written $f(x) = x_1 x_2 f_1 \oplus x_1 f_2 \oplus x_2 f_3 \oplus f_4$,
where $f_1$, $f_2$, $f_3$ and $f_4$ are $(n-2)$-variable symmetric functions. As $f$ is symmetric,
only $D_{e_1} f$, $D_{e_2} f$ and $D_{e_1+e_2} f$ have to be considered. Then, $D_{e_1+e_2} f = (1 \oplus
x_1 \oplus x_2) f_1 \oplus f_2 \oplus f_3$. Since the function $f$ is symmetric, we have $f_2 = f_3$ and
$D_{e_1+e_2} f(x) = (1 \oplus x_1 \oplus x_2) f_1$. Thus, $D_{e_1+e_2} f$ is balanced if and only if $f_1$ identically
equals 1. We can deduce that every symmetric Boolean function $f$ satisfying PC(2)
has algebraic degree 2. Conversely, the derivative $D_u f$ of a symmetric quadratic
function $f$ is a non-constant affine function except if $n$ is odd and $u = (1, \ldots, 1)$. □
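
For small $n$ the statement of Theorem 1 can be checked exhaustively, since a symmetric function is determined by its $n + 1$ values on the Hamming weights. The sketch below is an illustration added here, not part of the original proof. It enumerates all symmetric functions, keeps those whose derivatives $D_u f$ are balanced for every $u$ of weight 1 or 2, and compares the result with the four functions $\sigma_2 \oplus a\sigma_1 \oplus b$, where $\sigma_d$ denotes the elementary symmetric polynomial of degree $d$. The names are hypothetical.

```python
from itertools import product


def weight(x):
    """Hamming weight of the bit vector encoded by the integer x."""
    return bin(x).count("1")


def satisfies_pc(v, n, l):
    """True if the symmetric function x -> v[w_H(x)] satisfies PC(l), i.e.
    D_u f is balanced for every u with 1 <= w_H(u) <= l."""
    for u in range(1, 1 << n):
        if weight(u) > l:
            continue
        ones = sum(v[weight(x)] ^ v[weight(x ^ u)] for x in range(1 << n))
        if ones != 1 << (n - 1):
            return False
    return True


def symmetric_pc_functions(n, l):
    """Value vectors of all n-variable symmetric functions satisfying PC(l)."""
    return [v for v in product((0, 1), repeat=n + 1) if satisfies_pc(v, n, l)]


def quadratic_symmetric_vectors(n):
    """Value vectors of sigma_2 + a*sigma_1 + b; sigma_2 has value C(w, 2) mod 2."""
    return {tuple((w * (w - 1) // 2 + a * w + b) % 2 for w in range(n + 1))
            for a in (0, 1) for b in (0, 1)}


for n in (3, 4, 5):
    found = set(symmetric_pc_functions(n, 2))
    print(n, found == quadratic_symmetric_vectors(n))
```

The same `satisfies_pc` helper can be called with `l = n` to observe the parity condition in the last sentence of the theorem.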

Remark 1 . Conversely, the same proof can be used to show that if f is a quadratic 
Boolean function satisfying PC (2), then f is a symmetric Boolean function. 

Corollary 2. Let $f$ be an $n$-variable Boolean function satisfying PC($\ell$). If $f$ can be
decomposed in one of the following two forms:

1. $f(x) = f_1(x_1, \ldots, x_p) \oplus f_2(x_{p+1}, \ldots, x_n)$, where $f_1$ is a $p$-variable symmetric
function and $f_2$ an $(n-p)$-variable Boolean function; or

2. $f(x_1, \ldots, x_n) = f_1(x_1, \ldots, x_n) \oplus f_2(x_1, \ldots, \hat{x}_i, \ldots, \hat{x}_j, \ldots, x_n)$, where $f_1$ is an
$n$-variable symmetric function and $f_2$ is an $(n-2)$-variable Boolean function
(the variables $x_i$ and $x_j$ being omitted);

then either $f_1$ is quadratic or $\ell$ is at most 1.

Proof. Suppose $\ell \ge 2$. Since $f$ is PC($\ell$), the function $D_{e_i+e_j} f$ is balanced. For any
$(i, j)$ such that $1 \le i < j \le p$, we have $D_{e_i+e_j} f = D_{e_i+e_j} f_1$. Thus, the symmetric
function $f_1$ satisfies PC(2). From Theorem 1, either the function $f_1$ is quadratic
or the hypothesis $\ell \ge 2$ is false. □

References 

[1] A. Bernasconi, On Boolean functions satisfying odd order propagation criteria, 3rd 
International Workshop on Boolean Problems, IWSBP’98, (1998), 117-124. 

[2] A. Bonnecaze, P. Sole, and A.R. Calderbank, Quaternary quadratic residue codes 
and unimodular lattices, IEEE Transactions on Information Theory, 41 (1995), 366- 
377. 

[3] P. Camion, C. Carlet, P. Charpin, and N. Sendrier, On correlation-immune functions, 
in Advances in Cryptology, Proc. of Crypto ’91, LNCS 576 (1991), 86-100. 







[4] C. Carlet, On the propagation criterion of degree l and order k, in Advances in
Cryptology, Proc. of EUROCRYPT '98, LNCS 1403 (1998), 462-474.

[5] C. Carlet, On cryptographic propagation criteria for Boolean functions, Special Issue 
on Cryptology of Information and Computation 150 (1999), 32-56. 

[6] C. Carlet and P. Guillot, A new representation of Boolean functions, AAECC, (1999), 
94-103. 

[7] J. Daemen, R. Govaerts, and J. Vandewalle, A practical approach to the design of 
high speed self-synchronizing stream ciphers, Singapore ICCS/ISITA ’92 Conference 
Proceedings, IEEE, (1992), pp. 279-283. 

[8] J. F. Dillon. Elementary Hadamard Difference sets, Ph. D. Thesis, Univ. of Mary- 
land, 1974. 

[9] R. Forre, The strict avalanche criterion: spectral properties of Boolean functions and 
an extended definition, in Advances in Cryptology, Proc. of CRYPTO’88, LNCS 403 
(1989), 450-468. 

[10] J. von zur Gathen and J. Roche, Polynomials with two values, Combinatorica, 17
(3) (1997), 345-362.

[11] T. Honda, T. Satoh, T. Iwata, and K. Kurosawa, Balanced Boolean functions
satisfying PC(2) and very large degree, in Proceedings of SAC'97, (1997), 64-72.

[12] K. Kurosawa and T. Satoh, Design of SAC/PC(l) of order k Boolean functions
and three other cryptographic criteria, in Advances in Cryptology, Proc. of
EUROCRYPT '97, LNCS 1223 (1997), 434-449.

[13] W. Meier and O. Staffelbach, Nonlinearity criteria for cryptographic functions, in
Advances in Cryptology, Proc. of EUROCRYPT '89, LNCS 434 (1990), 549-562.

[14] V.S. Pless and W.C. Huffman (Eds.), The Handbook of Coding Theory, North- 
Holland, New York, 1998. 

[15] V.S. Pless and Z. Qian, Cyclic codes and quadratic residue codes over Z_4, IEEE
Transactions on Information Theory, 42 (5) (1996), 1594-1600.

[16] B. Preneel, W. Van Leekwijck, L. Van Linden, R. Govaerts, and J. Vandewalle, 
Propagation characteristics of Boolean functions, in Advances in Cryptology, Proc. 
of EUROCRYPT’90, LNCS 473 (1991), 161-173. 

[17] B. Preneel, R. Govaerts, and J. Vandewalle, Boolean functions satisfying higher-order 
propagation criterion, in Advances in Cryptology, Proc. of Eurocrypt ’91, LNCS 547 
(1991), 141-152. 

[18] O.S. Rothaus, On bent functions, Journal of Combinatorial Theory (A), 20, (1976), 
300-305. 

[19] P. Savicky, On the bent functions that are symmetric, European J. of Combinatorics , 
15 (1994), 407-410. 

[20] T. Siegenthaler, Correlation-immunity of nonlinear combining functions for
cryptographic applications, IEEE Transactions on Information Theory, 30 (5) (1984),
776-780.

[21] A.F. Webster and S.E. Tavares, On the design of S-box, in Advances in Cryptology,
Proc. of CRYPTO '85, LNCS 218 (1986), 523-534.







[22] Y. Zheng and X. M. Zhang, On relationships among avalanche, nonlinearity, and
correlation-immunity, in Advances in Cryptology, Proc. of ASIACRYPT'00, LNCS
1976 (2000), 470-482.



Aline Gouget 

GREYC, Université de Caen
F-14032 Caen Cedex, France
e-mail: gouget@info.unicaen.fr 




Progress in Computer Science and Applied Logic, Vol. 23, 169-176
© 2004 Birkhäuser Verlag Basel/Switzerland



On Certain Equations over Finite Fields
and Cross-Correlations of m-Sequences

Tor Helleseth, Jyrki Lahtonen and Petri Rosendahl 



Abstract. We study the number of solutions to certain equations over finite
fields and show how this gives a family of four-valued cross-correlation
functions of binary m-sequences. This new family includes both of the four-valued
cross-correlations found by Niho.

Mathematics Subject Classification (2000). Primary 11T55; Secondary 94A55.
Keywords. Finite fields, Cross-correlation, m-sequences.



1. Introduction 

For the theory of finite fields, their equations and characters we refer to [7] and [6]. 

The finite field with $q = p^k$ elements is denoted by $GF(q)$. Later, when studying
cross-correlation functions of binary m-sequences, we will restrict ourselves to
the case $p = 2$.

Let $y \in GF(q^2) \setminus \{0\}$, and denote $\bar{y} = y^q$. We will find the possible number
of solutions to

\[
\begin{cases}
x^{p^s+1} + y x^{p^s} - \bar{y} x - 1 = 0, \\
x^{q+1} = 1.
\end{cases}
\tag{1.1}
\]

The motivation to study this kind of equation comes from a cross-correlation
problem for m-sequences. However, this equation is interesting in itself. We will
see that it, in some sense, behaves like an affine equation over the subfield. In fact,
our treatment is based on this idea.

In the binary case, the possible number of solutions to the above equation
gives the possible values taken by the cross-correlation function of two binary
m-sequences of period $2^n - 1$ which differ by the decimation

\[
d = (2^{2k} + 2^{s+1} - 2^{k+1} - 1)/(2^s - 1), \tag{1.2}
\]

where we have assumed that $n = 2k$ and that $2s$ divides $k$. It turns out that the
cross-correlation function is four-valued.







Finding the distribution of the values taken by the cross-correlation corre- 
sponding to the decimation above involves solving another equation, namely 

\[
(x + 1)^d + x^d + 1 = 0. \tag{1.3}
\]

For a given d, this is usually more or less a routine task. We give the number of 
solutions for a family of decimations d. 

The cross-correlation function between two cyclically distinct m-sequences
takes at least three values, see [2], and all known three-valued cases are covered by
theoretical results. Previously, only three families of four-valued cross-correlation
functions have been found. These correspond to the decimations

A. $d = 2^{n/2+1} - 1$, with $n \equiv 0 \pmod 4$,

B. $d = (2^{n/2} + 1)(2^{n/4} - 1) + 2$, with $n \equiv 0 \pmod 4$, and

C. $d = \sum_{i=0}^{n/2} 2^{im}$, with $n \equiv 0 \pmod 4$, $0 < m < n$, $\gcd(n, m) = 1$.

The cases A. and B. are due to Niho [9] and case C. is due to Dobbertin [1]. The
decimations in C. include the decimations in A.

Our family of decimations includes the decimations both in A. and B., and
in addition case C. leads to the same pair of equations. Thus all known infinite
families of four-valued cross-correlations arise from the same equation!



2. The Equation 

Suppose $n$ is even, say $n = 2k$. We denote $q = p^k$. In analogy with the usual
complex conjugation we will denote

\[
\bar{y} = y^q
\]

for $y \in GF(q^2)$. The usual properties of conjugation carry over to the finite case.
For instance, we have $\overline{u + v} = \bar{u} + \bar{v}$ and $u + \bar{u} \in GF(q)$ for all $u, v \in GF(q^2)$. A
less trivial property is presented in the following lemma.

We define the unit circle of $GF(q^2)$ to be the set

\[
S = \{ x \in GF(q^2) : x\bar{x} = 1 \}.
\]

Lemma 2.1.

(i) Let $z \in GF(q^2) \setminus GF(q)$ be fixed. Then
\[
S \setminus \{1\} = \left\{ \frac{z + u}{\bar{z} + u} : u \in GF(q) \right\}.
\]

(ii) Let $\beta \in S \setminus \{\pm 1\}$ be fixed. Then
\[
S \setminus \{\beta\} = \left\{ \frac{\beta\alpha + 1}{\alpha + \beta} : \alpha \in GF(q) \right\}.
\]

Proof. Since $\bar{u} = u$ for $u \in GF(q)$ we have $\bar{x} = x^{-1}$ for $x = (z + u)/(\bar{z} + u)$.
Furthermore, $z \in GF(q^2) \setminus GF(q)$ implies that the elements of this form are
distinct and different from 1. This proves (i), and (ii) is equally simple. □







We note here that both of these parameterizations have a geometric inter- 
pretation, and this is the way they were found. 

Lemma 2.2. Let $a$ be a nonzero element in some extension of the field $GF(p)$. If
the equation

\[
x^{p^s - 1} = a
\]

has a solution in $GF(p^k)$, then it has exactly $p^{\gcd(k,s)} - 1$ solutions in the field
$GF(p^k)$.

Proof. Assume $x_0 \in GF(p^k)$ satisfies the above equation. Then any $u x_0$, with
$u$ a nonzero element of $GF(p^s)$, is a solution and every solution is obtained in this way. The claim
follows from the fact $GF(p^k) \cap GF(p^s) = GF(p^r)$, where $r = \gcd(k, s)$. □

Theorem 2.3. Let $n = 2k$ and $y \in GF(q^2) \setminus \{0\}$. The equation

\[
x^{p^s+1} + y x^{p^s} - \bar{y} x - 1 = 0 \tag{2.1}
\]

has either 0, 1, 2 or $p^{\gcd(s,k)} + 1$ solutions $x \in S$.

Proof. The proof is divided into two cases.

Case 1. Assume first that $y \in GF(q)$, i.e., $\bar{y} = y$. In this case $x = 1 \in S$ is a solution
to (2.1). We apply the parameterization (i) of Lemma 2.1 to the equation (2.1),
and then multiply it by $(\bar{z} + u)^{p^s+1}$ (note that the coefficient of $u^{p^s+1}$ disappears)
to get

\[
(z - \bar{z} + y\bar{z} - yz)\,u^{p^s} + (z^{p^s} - \bar{z}^{p^s} + y z^{p^s} - y \bar{z}^{p^s})\,u
= -(z^{p^s+1} - \bar{z}^{p^s+1} + y z^{p^s}\bar{z} - y z \bar{z}^{p^s}).
\]

Every solution $x \in S \setminus \{1\}$ to (2.1) corresponds to a solution $u \in GF(q)$ of the
previous equation.

If $z - \bar{z} + y\bar{z} - yz = 0$, there is nothing to prove. Otherwise we have an affine
equation of the form

\[
u^{p^s} + a_1 u = a_2, \tag{2.2}
\]

where $a_1, a_2 \in GF(q)$. Lemma 2.2 implies that the corresponding linear equation

\[
u^{p^s} + a_1 u = 0 \tag{2.3}
\]

has either exactly one root or exactly $p^{\gcd(k,s)}$ roots in $GF(q)$. From linear algebra
(or the theory of linearized polynomials, see [7]) we know that the affine equation
(2.2) either has no solutions or it has the same number of solutions as (2.3). Hence,
in the case $y \in GF(q)$, the equation (2.1) has either 1, 2 or $p^{\gcd(k,s)} + 1$ solutions
in $S$.

Case 2. For the rest of the proof, we assume that $y \notin GF(q)$. If (2.1) has no solution
in $S$, we are through. Suppose now that there is such a solution. We apply the
parameterization (ii) of Lemma 2.1 to the equation (2.1). Since $y \notin GF(q)$, neither
$1$ nor $-1$ is a solution, so the
fixed element $\beta$ can be chosen to be one of the solutions. Multiplied by $(\alpha + \beta)^{p^s+1}$
the equation (2.1) transforms to

\[
\begin{aligned}
&(\beta^{p^s+1} + y\beta^{p^s} - \bar{y}\beta - 1)\,\alpha^{p^s+1}
 + (\beta^{p^s} + y\beta^{p^s+1} - \bar{y} - \beta)\,\alpha^{p^s} \\
&\quad + (\beta + y - \bar{y}\beta^{p^s+1} - \beta^{p^s})\,\alpha
 + (1 + y\beta - \bar{y}\beta^{p^s} - \beta^{p^s+1}) = 0.
\end{aligned}
\]

Here the leading coefficient is zero, because $\beta$ is a solution of (2.1). We should now find the solutions in
$GF(q)$.

If the coefficient of $\alpha^{p^s}$ above is zero, then we are through. Otherwise we have
again an affine equation of the form

\[
u^{p^s} + \alpha_1 u = \alpha_2, \tag{2.4}
\]

where $\alpha_1, \alpha_2 \in GF(q^2)$. To complete the proof, we may now proceed similarly as
in the case $y \in GF(p^k)$. □
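
Theorem 2.3 is easy to test by brute force in the smallest binary case $p = 2$, $k = 2$, $s = 1$, where $GF(q^2) = GF(16)$ and the unit circle $S$ has $q + 1 = 5$ elements. The sketch below is an added illustration under these assumptions: field elements are 4-bit integers, the modulus $x^4 + x + 1$ is one possible choice of irreducible polynomial, and the helper names are hypothetical. In characteristic 2 the signs in (2.1) disappear.

```python
from collections import Counter

MODULUS = 0b10011  # x^4 + x + 1, irreducible over GF(2)


def gf_mul(a, b):
    """Multiply two elements of GF(16), represented as 4-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= MODULUS
    return result


def gf_pow(a, e):
    """Raise a to the non-negative integer power e by square-and-multiply."""
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result


P, K, S_EXP = 2, 2, 1  # p = 2, k = 2, s = 1
Q = P ** K             # q = 4, so GF(q^2) = GF(16)
UNIT_CIRCLE = [x for x in range(1, 16) if gf_pow(x, Q + 1) == 1]


def solutions_on_unit_circle(y):
    """Number of x in S with x^(p^s+1) + y x^(p^s) + conj(y) x + 1 = 0."""
    y_bar = gf_pow(y, Q)
    count = 0
    for x in UNIT_CIRCLE:
        value = (gf_pow(x, P ** S_EXP + 1)
                 ^ gf_mul(y, gf_pow(x, P ** S_EXP))
                 ^ gf_mul(y_bar, x) ^ 1)
        count += value == 0
    return count


distribution = Counter(solutions_on_unit_circle(y) for y in range(1, 16))
print(sorted(distribution.items()))
```

Every count lies in $\{0, 1, 2, 2^{\gcd(s,k)} + 1\} = \{0, 1, 2, 3\}$, as the theorem predicts for these parameters.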



The binary case of this theorem is proved in [5], and in that paper only 
parameterization (i) is used. Actually, either one of the parameterizations would 
be enough in the binary case. In this more general case some difficulties occur if 
we try to use either (i) or (ii) only. 

An easy computation shows that $\bar{\alpha}_i = \alpha_i$ in the equation (2.4). Hence we
have in fact $\alpha_1, \alpha_2 \in GF(q)$, although this is not needed.

It may seem difficult to find the number of times each possibility happens, 
e.g., how many times (2.1) has exactly one solution in S. However, in the binary 
case the equation is related to certain cross-correlation functions, and the question 
above can be answered by solving an equation of the type (1.3). We will do this 
for a more general class of decimations d after giving some background. 



3. An Application 

For basic properties of m-sequences we refer to [8] and [4]. 

From now on we assume that $p = 2$. Recall that the cross-correlation function
between two binary sequences $u(t)$ and $v(t)$ of the same period $e$ is by definition

\[
C_{u,v}(\tau) = \sum_{t=0}^{e-1} (-1)^{u(t) + v(t+\tau)}.
\]

An important problem in sequence analysis is to determine the values taken by
the cross-correlation function and the number of occurrences of each value.

Assume now that $u(t)$ and $v(t)$ are m-sequences of period $2^n - 1$. We may
assume that $u(t)$ is given by

\[
u(t) = \mathrm{tr}_1^n(\gamma^t),
\]

where $\mathrm{tr}_1^n$ denotes the trace from $GF(2^n)$ onto $GF(2)$ and $\gamma$ is a primitive element
of $GF(2^n)$. Furthermore, $v(t)$ can be shifted cyclically in such a way that $v(t) =
u(dt)$ for some $d$ satisfying $\gcd(d, 2^n - 1) = 1$. As usual, we denote the
cross-correlation function of these sequences by $C_d(\tau)$, i.e.,

\[
C_d(\tau) = \sum_{t=0}^{2^n-2} (-1)^{\mathrm{tr}_1^n(\gamma^{t} + \gamma^{d(t+\tau)})}.
\]

It is well known that the values (and the number of their occurrences) of $C_d(\tau)$
depend only on $d$, and not on the choice of the primitive element.
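
The definition can be evaluated directly for a small example. The sketch below is an added illustration: it builds the m-sequence $u(t) = \mathrm{tr}(\gamma^t)$ over $GF(16)$ and tabulates $C_d(\tau)$ for $d = 7$, which is decimation A. for $n = 4$. The primitive polynomial $x^4 + x + 1$, the choice $\gamma = x$, and all names are assumptions of the sketch.

```python
from collections import Counter

MODULUS = 0b10011  # x^4 + x + 1, a primitive polynomial over GF(2)
N = 4
PERIOD = 2 ** N - 1


def gf_mul(a, b):
    """Multiply two elements of GF(16), represented as 4-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= MODULUS
    return result


def trace(x):
    """Absolute trace GF(16) -> GF(2): x + x^2 + x^4 + x^8 (returns 0 or 1)."""
    total = 0
    for _ in range(N):
        total ^= x
        x = gf_mul(x, x)
    return total


def m_sequence():
    """One period of u(t) = tr(gamma^t), with gamma = x (the integer 2)."""
    sequence, power = [], 1
    for _ in range(PERIOD):
        sequence.append(trace(power))
        power = gf_mul(power, 2)
    return sequence


def cross_correlation(u, d):
    """C_d(tau) = sum_t (-1)^(u(t) + u(d (t + tau))) for tau = 0..PERIOD-1."""
    return [sum((-1) ** (u[t] ^ u[(d * (t + tau)) % PERIOD])
                for t in range(PERIOD))
            for tau in range(PERIOD)]


u = m_sequence()
values = cross_correlation(u, 7)
print(sorted(Counter(values).items()))
```

Calling `cross_correlation(u, 1)` gives the two-valued autocorrelation of the m-sequence, which is a convenient sanity check on the field arithmetic.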

The main technique used in [9] is given by the following theorem. Again we 
assume that n = 2k. 

Theorem 3.1. Assume that the integer $d$ satisfies

(i) $\gcd(d, 2^n - 1) = 1$,

(ii) $d \equiv 1 \pmod{2^k - 1}$, and

(iii) $ed \equiv f \pmod{2^k + 1}$,

for some $f$ and some $e$ for which $\gcd(e, 2^k + 1) = 1$. Then $C_d(\tau)$ assumes exactly
the values

\[
-1 + 2^k (N(y) - 1), \tag{3.1}
\]

where $N(y)$ is the number of solutions to the pair of equations

\[
\begin{cases}
x^{2f} + y x^{f+e} + \bar{y} x^{f-e} + 1 = 0, \\
x^{2^k+1} = 1,
\end{cases}
\tag{3.2}
\]

and $y$ runs through the nonzero elements of $GF(2^n)$.

The proof of Theorem 3.1 is based on the transitivity of the trace and the
observation that every $x \in GF(2^n) \setminus \{0\}$ can be represented uniquely as $x = \alpha\beta$,
where $\alpha \in GF(2^k) \setminus \{0\}$ and $\beta \in S$.

The assumption (i) is needed only to guarantee that the decimated sequence 
is indeed an m-sequence. Without this condition, the theorem would still be useful 
in determining cross-correlation functions or weight distributions of cyclic codes. 
Let

\[
d = \frac{2^{k-1}}{2^s - 1}\,(2^{2k} + 2^{s+1} - 2^{k+1} - 1), \tag{3.3}
\]

where it is assumed that $n = 2k$ and $2s$ divides $k$. It is straightforward to see that $d$
satisfies the conditions of Theorem 3.1 for $e = 2^s - 1$ and $f = 2^k - 2^s$. Now the
corresponding equation is exactly the binary special case of (2.1).
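
The conditions of Theorem 3.1 can be checked numerically for this family. The sketch below is an added illustration; it takes $d = 2^{k-1}(2^{2k} + 2^{s+1} - 2^{k+1} - 1)/(2^s - 1)$, and this reading of (3.3), in particular the leading factor $2^{k-1}$, is an assumption of the sketch. The function names are hypothetical.

```python
from math import gcd


def decimation(k, s):
    """d = 2^(k-1) (2^(2k) + 2^(s+1) - 2^(k+1) - 1) / (2^s - 1).

    The leading factor 2^(k-1) is an assumed reading of (3.3)."""
    numerator = 2 ** (2 * k) + 2 ** (s + 1) - 2 ** (k + 1) - 1
    quotient, remainder = divmod(numerator, 2 ** s - 1)
    if remainder:
        raise ValueError("2^s - 1 does not divide the numerator")
    return 2 ** (k - 1) * quotient


def satisfies_theorem_3_1(k, s):
    """Check conditions (i)-(iii) with e = 2^s - 1 and f = 2^k - 2^s."""
    n, d = 2 * k, decimation(k, s)
    e, f = 2 ** s - 1, 2 ** k - 2 ** s
    return (gcd(d, 2 ** n - 1) == 1            # (i)
            and d % (2 ** k - 1) == 1          # (ii)
            and (e * d - f) % (2 ** k + 1) == 0  # (iii)
            and gcd(e, 2 ** k + 1) == 1)


# All admissible pairs with 2s dividing k, for small k.
pairs = [(k, s) for k in range(2, 13) for s in range(1, k) if k % (2 * s) == 0]
for k, s in pairs:
    print(k, s, decimation(k, s), satisfies_theorem_3_1(k, s))
```

Multiplying a decimation by a power of 2 does not change the cross-correlation distribution, so this $d$ and the one in (1.2) describe the same family.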

In view of (3.1), Theorem 2.3 now implies that for the $d$ in question, $C_d(\tau)$ is
indeed four-valued, and that the cross-correlation values are $-1 - 2^k$, $-1$, $-1 + 2^k$,
and $-1 + 2^{k+s}$. In order to find the distribution of the values (or the number of
occurrences of each possibility in Theorem 2.3), we will use the following lemma.

Lemma 3.2. We have

(i) $\sum_{\tau=0}^{2^n-2} (C_d(\tau) + 1) = 2^n$,

(ii) $\sum_{\tau=0}^{2^n-2} (C_d(\tau) + 1)^2 = 2^{2n}$,

(iii) $\sum_{\tau=0}^{2^n-2} (C_d(\tau) + 1)^3 = 2^{2n} b$,

where $b$ is the number of $x \in GF(2^n)$ such that

\[
(x + 1)^d = x^d + 1.
\]

The equations (i) and (ii) are well known and proofs can be found, e.g., in
[9]. The equation (iii) is proved in [2].

We will now find the number b for a family of decimations. 

Lemma 3.3. Let $\beta, \gamma \in S$. Then $\beta + \gamma \in GF(2^k)$ if and only if $\beta = \gamma$ or $\beta = \gamma^{-1}$.
We omit the simple proof.

Theorem 3.4. Assume that $d \equiv 1 \pmod{2^k - 1}$. If $\gcd(d - 1, 2^k + 1) = \gcd(d +
1, 2^k + 1) = 1$, then the equation

\[
(x + 1)^d = x^d + 1 \tag{3.4}
\]

has exactly $2^k$ solutions in $GF(2^n)$.

Proof. Every $x \in GF(2^k)$ is a solution to (3.4) since $d \equiv 1 \pmod{2^k - 1}$. We now
assume that $x \ne 0$ satisfies (3.4).

The equation (3.4) implies $(\bar{x} + 1)^d = \bar{x}^d + 1$, and hence

\[
(x + 1)^d (\bar{x} + 1)^d = (x^d + 1)(\bar{x}^d + 1),
\]

that is

\[
(x\bar{x} + x + \bar{x} + 1)^d = (x\bar{x})^d + x^d + \bar{x}^d + 1.
\]

For $a \in GF(2^k)$ we have $a^d = a$, and thus

\[
x^d + x = \bar{x}^d + \bar{x}.
\]

This is equivalent to

\[
x^d + x \in GF(2^k). \tag{3.5}
\]

Representing $x = \alpha\beta$, where $\alpha \in GF(2^k)$ and $\beta \in S$, gives that $\beta \in S$ satisfies
$\beta^d + \beta \in GF(2^k)$. Lemma 3.3 implies $\beta^d = \beta$ or $\beta^d = \beta^{-1}$, i.e., $\beta^{d \pm 1} = 1$. By
assumption, this is possible if and only if $\beta = 1$, and thus $x \in GF(2^k)$. □
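
For $n = 4$ and $d = 7$ the hypotheses of Theorem 3.4 hold ($7 \equiv 1 \pmod 3$ and $\gcd(6, 5) = \gcd(8, 5) = 1$), so the solution set of (3.4) should be exactly the subfield $GF(4)$. The following sketch, added here as an illustration, counts the solutions in $GF(16)$ by exhaustion; the modulus $x^4 + x + 1$ and the helper names are assumptions.

```python
MODULUS = 0b10011  # x^4 + x + 1, irreducible over GF(2)


def gf_mul(a, b):
    """Multiply two elements of GF(16), represented as 4-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= MODULUS
    return result


def gf_pow(a, e):
    """Raise a to the non-negative integer power e by square-and-multiply."""
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result


def solutions_of_3_4(d):
    """All x in GF(16) with (x + 1)^d = x^d + 1 (addition is XOR)."""
    return [x for x in range(16) if gf_pow(x ^ 1, d) == gf_pow(x, d) ^ 1]


subfield = [x for x in range(16) if gf_pow(x, 4) == x]  # GF(4) inside GF(16)
print(solutions_of_3_4(7), subfield)
```

The length of the returned list is the number $b$ that enters Lemma 3.2 (iii).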

Lemma 3.5. We have $\gcd(d \pm 1, 2^k + 1) = 1$ for $d$ in (3.3).

Proof. Since now $\gcd(2^s - 1, 2^k + 1) = 1$, we have $\gcd(d \pm 1, 2^k + 1) = \gcd((2^s -
1)(d \pm 1), 2^k + 1)$. The lemma follows easily from the congruence $(2^s - 1)d \equiv 2^k - 2^s
\pmod{2^k + 1}$. □



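Finally, $C_d(\tau)$ is as follows.

Theorem 3.6. Let $n = 2k$, where $2s$ divides $k$, and let $d = (2^{2k} + 2^{s+1} - 2^{k+1} -
1)/(2^s - 1)$. Then the cross-correlation function $C_d(\tau)$ between two m-sequences
takes the following values:

\[
\begin{array}{lll}
-1 - 2^k & \text{occurs} & \dfrac{2^{2k+s-1} - 2^{k+s-1}}{2^s + 1} \ \text{times}, \\[2ex]
-1 & \text{occurs} & \dfrac{2^{2k} - 2^k - 2^s}{2^s} \ \text{times}, \\[2ex]
-1 + 2^k & \text{occurs} & \dfrac{2^{2k+s-1} - 2^{2k} + 2^{k+s-1}}{2^s - 1} \ \text{times}, \\[2ex]
-1 + 2^{k+s} & \text{occurs} & \dfrac{2^{2k} - 2^k}{2^{3s} - 2^s} \ \text{times}.
\end{array}
\]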







Proof. Theorem 2.3 shows that $C_d(\tau)$ is four-valued and gives the values. Furthermore,
Theorem 3.4 gives the number $b$ of Lemma 3.2. Denote by $N_i$ the number
of times (2.1) has exactly $i$ solutions in $S$. We have a system of linear equations

\[
\begin{aligned}
N_0 + N_1 + N_2 + N_{2^s+1} &= 2^{2k} - 1, \\
-2^k N_0 + 2^k N_2 + 2^{k+s} N_{2^s+1} &= 2^{2k}, \\
2^{2k} N_0 + 2^{2k} N_2 + 2^{2k+2s} N_{2^s+1} &= 2^{4k}, \\
-2^{3k} N_0 + 2^{3k} N_2 + 2^{3k+3s} N_{2^s+1} &= 2^{5k}.
\end{aligned}
\]

The first equation comes from the number of equations of the form (2.1), and the
other ones are simple consequences of Lemma 3.2. Straightforward calculations
give the claimed distribution. □
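
The "straightforward calculations" can be double-checked mechanically. The sketch below is an added illustration with hypothetical names: it evaluates the four occurrence counts of Theorem 3.6 with exact rational arithmetic and substitutes them into the linear system from the proof.

```python
from fractions import Fraction


def distribution(k, s):
    """Occurrence counts (N_0, N_1, N_2, N_{2^s+1}) from Theorem 3.6."""
    n0 = Fraction(2 ** (2 * k + s - 1) - 2 ** (k + s - 1), 2 ** s + 1)
    n1 = Fraction(2 ** (2 * k) - 2 ** k - 2 ** s, 2 ** s)
    n2 = Fraction(2 ** (2 * k + s - 1) - 2 ** (2 * k) + 2 ** (k + s - 1),
                  2 ** s - 1)
    n3 = Fraction(2 ** (2 * k) - 2 ** k, 2 ** (3 * s) - 2 ** s)
    return n0, n1, n2, n3


def satisfies_system(k, s):
    """Substitute the counts into the four linear equations of the proof."""
    n0, n1, n2, n3 = distribution(k, s)
    a, b = 2 ** k, 2 ** s
    return (n0 + n1 + n2 + n3 == a ** 2 - 1
            and -a * n0 + a * n2 + a * b * n3 == a ** 2
            and a ** 2 * (n0 + n2) + (a * b) ** 2 * n3 == a ** 4
            and -a ** 3 * n0 + a ** 3 * n2 + (a * b) ** 3 * n3 == a ** 5)


for k, s in [(2, 1), (4, 1), (4, 2), (6, 1), (6, 3), (8, 2)]:
    counts = distribution(k, s)
    print(k, s, [int(c) for c in counts], satisfies_system(k, s))
```

For every pair with $2s$ dividing $k$ the four fractions reduce to integers, as they must since they count occurrences.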

Remark 3.7. It is a routine matter to verify that s = 1 (resp. s = k/2) corresponds 
to the case A. (resp. B.) given in the introduction. We note here that Niho’s proof of 
B. is somewhat complicated. In fact, it is incomplete in the sense that it essentially 
depends on a result due to Welch, and this result does not seem to be published. 
An earlier simple proof of B. can be found in [3]. 
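
One natural reading of "corresponds" is cyclotomic equivalence: two decimations give the same cross-correlation distribution when they differ by a power of 2 modulo $2^n - 1$. The sketch below, added as an illustration under that assumption, checks for a few values of $k$ that the decimation of Theorem 3.6 with $s = 1$ (resp. $s = k/2$) lies in the cyclotomic coset of decimation A. (resp. B.). The function names are hypothetical.

```python
def family_decimation(k, s):
    """d = (2^(2k) + 2^(s+1) - 2^(k+1) - 1) / (2^s - 1), as in Theorem 3.6."""
    return (2 ** (2 * k) + 2 ** (s + 1) - 2 ** (k + 1) - 1) // (2 ** s - 1)


def cyclotomic_coset(d, n):
    """The set {d * 2^i mod (2^n - 1) : 0 <= i < n}."""
    modulus = 2 ** n - 1
    return {(d * 2 ** i) % modulus for i in range(n)}


def case_a(n):
    """Niho's decimation A: 2^(n/2 + 1) - 1."""
    return 2 ** (n // 2 + 1) - 1


def case_b(n):
    """Niho's decimation B: (2^(n/2) + 1)(2^(n/4) - 1) + 2."""
    return (2 ** (n // 2) + 1) * (2 ** (n // 4) - 1) + 2


for k in (2, 4, 6, 8):
    n = 2 * k
    in_a = case_a(n) in cyclotomic_coset(family_decimation(k, 1), n)
    in_b = case_b(n) in cyclotomic_coset(family_decimation(k, k // 2), n)
    print(k, in_a, in_b)
```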

The case C. by Dobbertin [1] leads to the same equation but with the restriction
$\gcd(s, k) = 1$. The proof presented by Dobbertin is based on Niho's technique
but is different otherwise. Thus we have an alternative proof also in this case. It
should be noted that, according to the computed results, there are four-valued
cross-correlations which are not related to the equation studied in Section 2.

Niho [9] gave tables of binary cross-correlation functions up to $n = 16$,
and now all at most four-valued cross-correlation functions of binary m-sequences
within this table belong to a known infinite family.

Lastly we mention the well-known fact that the problem of determining the 
cross-correlation function of m-sequences is equivalent to determining the weight 
distribution of certain cyclic codes. This connection is explained in detail in [6]. 

References 

[1] H. Dobbertin: One-to-one highly nonlinear power functions on GF(2^n), AAECC
Applicable Algebra in Engineering, Communication and Computing 9 (1998), 139-152.

[2] T. Helleseth: Some results about the cross-correlation function between two maximal
linear sequences, Discrete Mathematics 16 (1976), 209-232.

[3] T. Helleseth: A note on the cross-correlation function between two binary maximal
length linear sequences, Discrete Mathematics 23 (1978), 301-307.

[4] T. Helleseth, P.V. Kumar: Sequences with low correlation, in Handbook of Coding
Theory (ed. V.S. Pless, W.C. Huffman), Elsevier Science (1998), 1765-1853.

[5] T. Helleseth, P. Rosendahl: New pairs of m-sequences with 4-level cross-correlation,
submitted to Finite Fields and Their Applications.

[6] I. Honkala, A. Tietäväinen: Codes and number theory, in Handbook of Coding Theory
(ed. V.S. Pless, W.C. Huffman), Elsevier Science (1998), 1141-1194.







[7] R. Lidl, H. Niederreiter: Finite Fields, Encyclopedia of Mathematics and Its
Applications vol. 20, Addison-Wesley, Reading (1983).

[8] R.J. McEliece: Finite Fields for Computer Scientists and Engineers, Kluwer
Academic Publishers, Boston (1987).

[9] Y. Niho: Multivalued cross-correlation functions between two maximal linear
recursive sequences, PhD Thesis, University of Southern California (1972).

[10] H.M. Trachtenberg: On the cross-correlation functions of maximal linear sequences,
PhD Thesis, University of Southern California (1970).



Tor Helleseth 

Department of Informatics 
University of Bergen 
N-5020 Bergen, Norway 
e-mail: torh@ii.uib.no

Jyrki Lahtonen 
Department of Mathematics 
University of Turku 
SF-20014 Turku, Finland 
e-mail: lahtonen@utu.fi 

Petri Rosendahl 
Department of Mathematics 
University of Turku 
SF-20014 Turku, Finland 
e-mail: perosen@utu.fi 




Progress in Computer Science and Applied Logic, Vol. 23, 177-192 
© 2004 Birkhäuser Verlag Basel/Switzerland



A Polly Cracker System Based on Satisfiability 

Françoise Levy-dit-Vehel and Ludovic Perret



Abstract. This paper presents a public-key cryptosystem based on a subclass 
of the well-known satisfiability problem from propositional logic, namely the 
doubly-balanced 3-SAT problem. We describe the construction of an instance 
of our system - which is a modified Polly Cracker scheme - starting from 
such a 3-SAT formula. Then we discuss security issues: this is achieved on the 
one hand by exploring best methods to date for solving this particular prob- 
lem, and on the other hand by studying (systems of multivariate) polynomial 
equation solving algorithms in this particular setting. The main feature of our 
system is the resistance to intelligent linear algebra attacks. 

Keywords. Combinatorial-algebraic cryptosystems, systems of polynomial
equations, 3-SAT, hard instances generation.



1. Introduction 

Since the failure of knapsack-based cryptosystems [Od, Sh], a widely accepted
opinion was that NP-complete problems were not suited for the construction of secure
trapdoor one-way functions. In 1993, M. Fellows and N. Koblitz [FK] proposed to
further investigate the use of those problems for designing public-key cryptosystems,
and proposed a general framework, called CA-systems¹, the main illustration
of which was the Polly Cracker cryptosystem. In this system, the public-key is a
finite set $S = \{p_i\}$ of multivariate polynomials over a finite field $\mathbb{F}_q$, and the
secret-key is a zero $a$ of $S$. To encrypt a message $M \in \mathbb{F}_q$, Bob chooses an
element $e_S = \sum_i h_i p_i$ of the ideal generated by the polynomials of $S$, and sends
$c = e_S + M$ to Alice. Knowledge of $a$ then allows Alice to decrypt the ciphertext
just by evaluating it at $a$.
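
The encryption and decryption steps can be illustrated with a toy example. The sketch below is an added illustration only and is not the scheme proposed in this paper. It works over the small prime field $\mathbb{F}_7$, represents a polynomial as a dictionary from exponent tuples to coefficients, and builds the public polynomials as $p = g - g(a)$ for random $g$, a naive key generation used purely to make the example run. All names and parameter choices are hypothetical, and nothing here is secure.

```python
import random

Q = 7  # toy prime field F_7


def poly_add(f, g):
    """Sum of two polynomials given as {exponent tuple: coefficient}."""
    out = dict(f)
    for mono, c in g.items():
        out[mono] = (out.get(mono, 0) + c) % Q
    return {m: c for m, c in out.items() if c}


def poly_mul(f, g):
    """Product of two polynomials over F_Q."""
    out = {}
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            mono = tuple(a + b for a, b in zip(m1, m2))
            out[mono] = (out.get(mono, 0) + c1 * c2) % Q
    return {m: c for m, c in out.items() if c}


def poly_eval(f, point):
    """Evaluate f at a point of F_Q^n."""
    total = 0
    for mono, c in f.items():
        term = c
        for x, e in zip(point, mono):
            term = term * pow(x, e, Q) % Q
        total = (total + term) % Q
    return total


def random_poly(nvars, terms, max_deg, rng):
    """Random polynomial with at most `terms` monomials."""
    f = {}
    for _ in range(terms):
        mono = tuple(rng.randrange(max_deg + 1) for _ in range(nvars))
        f = poly_add(f, {mono: rng.randrange(1, Q)})
    return f


def keygen(nvars, count, rng):
    """Secret zero a and public polynomials p = g - g(a), so that p(a) = 0."""
    secret = tuple(rng.randrange(Q) for _ in range(nvars))
    public = []
    for _ in range(count):
        g = random_poly(nvars, 3, 2, rng)
        shift = {(0,) * nvars: (-poly_eval(g, secret)) % Q}
        public.append(poly_add(g, shift))
    return public, secret


def encrypt(message, public, nvars, rng):
    """c = sum_i h_i p_i + M with random multipliers h_i."""
    c = poly_add({}, {(0,) * nvars: message % Q})
    for p in public:
        c = poly_add(c, poly_mul(random_poly(nvars, 2, 1, rng), p))
    return c


def decrypt(ciphertext, secret):
    """Evaluate the ciphertext at the secret zero."""
    return poly_eval(ciphertext, secret)


rng = random.Random(2004)
public_key, secret_key = keygen(nvars=3, count=4, rng=rng)
ciphertext = encrypt(5, public_key, 3, rng)
print(decrypt(ciphertext, secret_key))
```

Decryption works because every public polynomial vanishes at the secret point, so only the constant message term survives the evaluation.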

The (public-key, secret-key) pair is derived from an instance of an NP-complete
combinatorial² problem, in such a way that knowing the public-key is equivalent
to knowing the considered instance, and that finding a secret-key from the
public-key is equivalent to finding a solution for this particular instance. M. Fellows and
N. Koblitz suggest several NP-complete problems for use in this context, mainly
based on graph theory (e.g., 3-colorability, perfect codes in graphs, ...) but do not
really investigate how to generate “hard” instances of these problems with a
fixed solution. As pointed out by R. Steinwandt et al. [GS], a naive technique to
generate such instances yields very weak public-keys.

¹ For “combinatorial-algebraic” cryptosystems.

² In a broad sense, i.e., this includes graph theory, boolean logic, ...

Here, we follow the CA-systems line of research by proposing a public-key
cryptosystem based on the well-known satisfiability problem from propositional
logic. More precisely, we use the 3-SAT problem. One advantage of using this
underlying hard problem is that it has been extensively studied, mainly due to the
fact that it is of interest in other research areas, such as planning or scheduling,
see, e.g., [CMi, CMo].

Although proven to be NP-complete, this problem admits many “easy” instances,
where deterministic algorithms (such as the recursive DPLL [DLL]) perform quite
well in practice. Indeed, let $n$ (resp. $m$) denote the number of variables (resp.
clauses) of the problem, and set $m = cn$ with $c \in \mathbb{R}^*_+$. Then, as $c$ increases, it
has been shown experimentally that the probability of an instance of 3-SAT being
satisfiable shifts from almost one to almost zero. The range of $c$ over which this
transition occurs is³ $3.003 < c < 4.598$. This is known as the threshold conjecture.
In this range, there is a value of $c$ corresponding to a complexity peak at which
on average half of the instances are satisfiable. The exact value of $c$ yielding this
peak can be numerically determined for each instance distribution.
Non-deterministic methods have also been devised, which often give better results
on satisfiable instances (e.g., Walksat [SKC]), especially near the threshold region.
They are known as local search methods.

The hardness of this problem is tightly located in the critical range for $c$, and for
(very) large values of $n$. Having this in mind, and also that the parameter sizes
and generation times of our system have to be polynomial⁴, we chose to restrict
ourselves to a particular class of the 3-SAT instances, namely the class of so-called
doubly-balanced 3-SAT [DB], a.k.a. literal-regular 3-SAT [BS]. Formulae in this
class have the particularity that every variable appears (almost) equally often,
and (almost) as often negated as unnegated. Instances from this class are much
more difficult to solve in general than random 3-SAT instances, as they are designed
to have structural regularities, thus confusing variable selection heuristics that are
used by most solvers (for example, DPLL-like algorithms treat the variables with
a small number of occurrences first).

Note that for random 3-SAT the complexity peak occurs for $c \approx 4.25$, while for
doubly-balanced 3-SAT, it has been shown to be $c \approx 3.5$ (both values experimentally
determined).

The paper is organized as follows: in the next section, we begin by providing
the necessary background to understand the basics of the 3-SAT problem, as well
as methods for generating random instances, and doubly-balanced ones. Then
we show how to translate this problem into a system of polynomial equations,
in order to use it in our cryptographic setting. We exhibit the correspondence
existing between the models of 3-SAT and the solutions of the system, and we link
particular 3-SAT formulae with reduced Gröbner bases. In Section 3, we describe
the cryptographic scheme we propose and present an original method to encrypt
messages, the security of which is addressed in Section 4. We address carefully the
single break attacks found by H.W. Lenstra Jr. [Ko], and show that they cannot
be conducted in our context. We also consider the differential attack proposed
in [SG], which is a very powerful tool to attack generic Polly Cracker systems. In
addition, we suggest an extension of this attack. On the other hand, we investigate
total break methods on the system. They are of two types: the first type is the
use of 3-SAT solvers to break the considered instances, from which we protected
ourselves by carefully choosing the instances. The second type is to run algorithms
computing (an element of) the variety of the set of polynomials involved. One of
the best algorithms known to us, namely $F_4$ [Fa], does in fact more: it computes
a Gröbner basis of the set of polynomials. For the considered sizes, it appears that
such an algorithm is of no help.

³ For 3-SAT; for k-SAT with higher values of k, this range is shifted. Also, the higher n is, the
sharper the range becomes.

⁴ In the size of the input of 3-SAT, namely n lg(n), where lg() denotes the base-two logarithm.

We end the paper with a section concerning implementation aspects. We would like
to mention that, when investigating Polly Cracker-type systems, our intention was
not to design a scheme that was likely to compete with the public-key systems
in use. What we were interested in was mainly to design a new Polly Cracker
system offering resistance to linear algebra attacks. Moreover, our approach to the
SATISFIABILITY problem in this cryptographic setting appears quite interesting, as
the public keys arising from this problem can be chosen strong.

2. CNF Formulae and Systems of Polynomial Equations 

2.1. 3-sat and Instance Generation Methods 

We begin by recalling what the 3-SAT problem is. Let $X = \{x_1, \ldots, x_n\}$ be a set of
variables and let $\wedge$, $\vee$, $\neg$ denote logical and, or, not respectively. A truth assignment
for $X$ is a function $t : X \to \{\text{True}, \text{False}\}$. For all $j$, $1 \le j \le n$, a literal $u_j$ is either
$x_j$ or $\bar{x}_j$. For a variable $x_j \in X$, a literal $x_j$ (resp. $\bar{x}_j$) is true if $t(x_j) = \text{True}$ (resp.
$t(x_j) = \text{False}$). A clause over $X$ is the disjunction of a set of literals over $X$. It is
satisfied by a truth assignment if, and only if, at least one of its literals is true under
that assignment. A clause containing only three literals will be called a 3-clause.
For instance, $C = x_{j_1} \vee \bar{x}_{j_2} \vee x_{j_3}$, $1 \le j_1, j_2, j_3 \le n$, is a 3-clause, and is satisfied
unless $t(x_{j_1}) = \text{False}$, $t(x_{j_2}) = \text{True}$, $t(x_{j_3}) = \text{False}$. A CNF-formula⁵ $\mathcal{C}$ is the
conjunction of arbitrarily many clauses $C_1, \ldots, C_m$, $m \in \mathbb{N}^*$. It is satisfiable if, and
only if, there exists some truth assignment for $X$ that simultaneously satisfies all
the clauses in $\mathcal{C}$. Such a truth assignment is called a satisfying truth assignment,
or a model for the formula $\mathcal{C}$. If $\mathcal{C}$ contains only 3-clauses, then we say that $\mathcal{C}$ is a
3-CNF formula. For instance, $\mathcal{C} = \bigwedge_{j=1}^{m} C_j$ where $C_j = u_{j_1} \vee u_{j_2} \vee u_{j_3}$, $m \in \mathbb{N}^*$, is
such a formula.

⁵ Conjunctive Normal Form.

In the sequel, we shall denote a CNF-formula either as a conjunction of clauses 
as above, or equivalently as a collection of clauses, the conjunction then being 
implicit. 

The 3-satisfiability problem can then be stated as follows:

INSTANCE: a collection $\mathcal{C} = \{C_1, \ldots, C_m\}$ of 3-clauses on $X$.

QUESTION: is there a satisfying truth assignment for $\mathcal{C}$?

The random 3-SAT problem which we referred to in the introduction is the 3-SAT
problem in which instances are generated according to the following procedure$^6$:
The number of variables $n$ and the number of clauses $m$ being fixed, randomly
select three distinct variables out of $n$, then negate each variable with probability
1/2. Combine these literals into a 3-clause. Repeat this process until the desired
number $m$ of clauses is reached. Conjoin them to form a CNF-formula.
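The procedure above can be sketched in a few lines of Python. This is only an illustration: the encoding of a literal as a signed integer ($+j$ for $x_j$, $-j$ for $\neg x_j$) and the function names `random_3sat` and `satisfies` are assumptions made here, not part of the paper.

```python
import random

def random_3sat(n, m, seed=None):
    """Return m random 3-clauses over the variables 1..n.

    A literal is a signed integer: +j stands for x_j, -j for its negation.
    """
    rng = random.Random(seed)
    clauses = []
    for _ in range(m):
        # Three distinct variables out of n.
        variables = rng.sample(range(1, n + 1), 3)
        # Negate each variable with probability 1/2.
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        clauses.append(clause)
    return clauses

def satisfies(assignment, clauses):
    """assignment[j] is the truth value of x_j (index 0 is unused)."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

formula = random_3sat(n=20, m=70, seed=1)
```

The conjunction is implicit: `formula` is simply the list of its clauses, as in the convention adopted earlier in this section.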

The restriction of 3-SAT to balanced formulae is the one in which a formula $\mathcal{C}$ is such
that, for all $i$, $1 \leq i \leq n$, each variable $x_i$ appears equally often$^7$, i.e., in $\lfloor 3m/n \rfloor$
clauses (there are $3m$ positions to fill, corresponding to the $m$ 3-clauses). But then,
it can be that some variables appear more often negated than unnegated (or the
converse). The doubly-balanced 3-SAT subclass is precisely the class of formulae
that do not present this type of irregularity; namely, a formula in this class is such
that each literal appears (almost) $3m/(2n)$ times (there are $2n$ possible literals).
Such instances can be generated with the following algorithm:

The number of variables $n$ and the number of clauses $m$ being fixed, place
$\lfloor 3m/(2n) \rfloor$ occurrences of each of the $2n$ literals in a bag. To reach exactly $3m$
literals in the bag, add some literals at random, never the same one twice. To construct
each clause, remove three literals on distinct variables from the bag. If, at some point,
the literals remaining in the bag concern only one or two distinct variables, then
randomly add distinct variables to the bag, negating each of them with probability
1/2. Continue constructing clauses until the desired number is reached.
Note that to generate a (doubly-balanced) formula admitting a particular model
$y$, one simply modifies the above procedure by throwing away the 3-clauses that
are not satisfied by $y$.
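A minimal Python sketch of one possible reading of this bag procedure follows. It is not the hgen2 generator, it assumes $n \geq 3$, and the signed-integer encoding of literals and the name `doubly_balanced_3sat` are choices made here for illustration.

```python
import random
from collections import Counter

def doubly_balanced_3sat(n, m, seed=None):
    """Bag-based sketch; returns m clauses of signed-integer literals."""
    rng = random.Random(seed)
    literals = [sign * v for v in range(1, n + 1) for sign in (1, -1)]
    base = (3 * m) // (2 * n)
    bag = [lit for lit in literals for _ in range(base)]
    # Top up to exactly 3m literals, never adding the same literal twice.
    bag += rng.sample(literals, 3 * m - len(bag))
    clauses = []
    while len(clauses) < m:
        present = {abs(lit) for lit in bag}
        while len(present) < 3:
            # Only one or two variables left: add a fresh one, random sign.
            v = rng.choice([u for u in range(1, n + 1) if u not in present])
            bag.append(v if rng.random() < 0.5 else -v)
            present.add(v)
        clause = []
        for _ in range(3):
            used = {abs(lit) for lit in clause}
            # Draw a literal whose variable is not yet in the clause.
            lit = rng.choice([l for l in bag if abs(l) not in used])
            bag.remove(lit)
            clause.append(lit)
        clauses.append(tuple(clause))
    return clauses

formula = doubly_balanced_3sat(n=20, m=70, seed=1)
counts = Counter(lit for clause in formula for lit in clause)
```

Inspecting `counts` shows how close a given run comes to the target of $3m/(2n)$ occurrences per literal; the top-up steps can introduce small deviations.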

$^6$ Fixed Clause Length generation.

$^7$ Almost: occurrences of some variables must be added if $3m/n$ is not an integer.

2.2. Constructing a System of Polynomial Equations from 3-SAT

We shall now explain how to translate an instance of the 3-SAT problem into a
system of polynomial equations. A similar description already appeared in [Ba].
We shall denote by $K[X]$ the polynomial ring $K[x_1, \ldots, x_n]$ over the field $K$.
We choose two field values $T, F \in K$, representing True and False respectively.
To a 3-clause $c$ involving the three literals $u_j$, $u_k$, $u_\ell$, $1 \leq j, k, \ell \leq n$, one can
associate a total degree 3 polynomial in $K[X]$ as follows: if $u_j = x_j$, then we
replace $u_j$ by $(x_j - T)$; if $u_j = \neg x_j$, then we replace it by $(x_j - F)$. Replace $\vee$ by
multiplication. For instance, the polynomial$^8$ $p_c(X) \in K[X]$ corresponding to the
clause $c = x_j \vee \neg x_k \vee x_\ell$ is $p_c(X) = (x_j - T)(x_k - F)(x_\ell - T)$. It is then clear that
a satisfying truth assignment of $X$ for $c$ corresponds to a zero of the polynomial
$p_c(X)$. With this construction, we have:

Theorem 2.1. A 3-CNF formula $\mathcal{C} = \bigwedge_{i=1}^{m} C_i$ admits a model if, and only if, the
corresponding system of polynomial equations $\{p_1(X) = 0, \ldots, p_m(X) = 0\}$ has a
solution over the algebraic closure of $K$.
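The correspondence between satisfying assignments and zeros of $p_c$ can be checked exhaustively on a single clause. The sketch below assumes a small prime field $\mathbb{F}_5$ with the hypothetical encodings $T = 1$ and $F = 2$; literals are signed integers, a convention introduced here for illustration only.

```python
from itertools import product

Q = 5          # assumed field size (a prime, so arithmetic is modulo Q)
T, F = 1, 2    # assumed non-zero, distinct encodings of True and False

def clause_poly_value(clause, point, q=Q):
    """Evaluate p_c at `point`.

    +j contributes the factor (x_j - T), -j contributes (x_j - F).
    point[j] is the field value given to x_j (index 0 is unused).
    """
    value = 1
    for lit in clause:
        root = T if lit > 0 else F
        value = value * (point[abs(lit)] - root) % q
    return value

def is_satisfied(clause, point):
    """A literal is true when its variable takes the matching encoding."""
    return any(point[abs(lit)] == (T if lit > 0 else F) for lit in clause)

clause = (1, -2, 3)   # x_1 OR (NOT x_2) OR x_3
agreement = all(
    (clause_poly_value(clause, (None,) + bits) == 0)
    == is_satisfied(clause, (None,) + bits)
    for bits in product((T, F), repeat=3)
)
```

Here `agreement` records whether, on all eight points of $\{T, F\}^3$, the polynomial vanishes exactly when the clause is satisfied.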

Let $k \in \{1, \ldots, m\}$, $\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, m\}$ and $\{C_{i_j}\}_{1 \leq j \leq k}$ be a set of
clauses. Writing $Var(C)$ for the set of variables occurring in a clause $C$, we shall say that
$\{Var(C_{i_j})\}_{1 \leq j \leq k}$ is a disjoint set if for all $a, b \in \{i_1, \ldots, i_k\}$, $a \neq b$,
$Var(C_a) \cap Var(C_b) = \emptyset$. We now give a simple connection
between a set of clauses and a Gröbner basis. For a detailed description of
Gröbner bases, we refer to [BW].

In order to prove the next proposition, we introduce some notation that will be
useful throughout the paper.

We shall denote by $Term = \{x_1^{\nu_1} \cdots x_n^{\nu_n} : (\nu_1, \ldots, \nu_n) \in \mathbb{N}^n\}$ the set of terms in
$\{x_1, \ldots, x_n\}$. We define the total degree of a term $x_1^{\nu_1} \cdots x_n^{\nu_n} \in Term$ as the sum
$\sum_{i=1}^{n} \nu_i$, $Term(f)$ as the set of terms of the polynomial $f \in K[X]$ and $HT(f)$ as
the head term of $f$ (with respect to some fixed order on the terms). A monomial
$at$ is simply a term $t$ multiplied by a constant $a \in \mathbb{F}_q$.

Proposition 2.2. Let $\mathcal{C} = \bigwedge_{i=1}^{m} C_i$ be a 3-CNF formula, $\{p_1, \ldots, p_m\}$ be polynomials
of $K[X]$ constructed from this formula as explained above, with $T, F \in K$.

If $\{Var(C_{i_j})\}_{1 \leq j \leq k}$ is a disjoint set, then $\{p_{i_j}\}_{1 \leq j \leq k}$ is a reduced Gröbner basis
of $\langle p_{i_j} \rangle_{1 \leq j \leq k}$ for the degree lexicographical (deglex) order.

Proof. The fact that $\{Var(C_{i_j})\}_{1 \leq j \leq k}$ is a disjoint set implies that any two $p, p' \in
\{p_{i_j}\}_{1 \leq j \leq k}$ have disjoint head terms. It follows, by Buchberger's first criterion,
that $\{p_{i_j}\}_{1 \leq j \leq k}$ is a Gröbner basis of $\langle p_{i_j} \rangle_{1 \leq j \leq k}$ for the deglex order. Moreover, by
construction, all these polynomials are monic. Finally, suppose that there exist
two different indices $a, b \in \{i_1, \ldots, i_k\}$ for which $t_b = t \cdot HT(p_a)$ with $t \in Term$
and $t_b \in Term(p_b)$, i.e., $t = t_b / HT(p_a)$.

It is then necessary that $Var(C_a) \cap Var(C_b) \neq \emptyset$, contradicting the assumption. □

Here we take $K$ to be a finite field $\mathbb{F}_q$. We require $T$ and $F$ to be two non-zero
field elements, so we set $q \geq 3$.

$^8$ Letting $X$ stand for $x_1, \ldots, x_n$.

3. The System

Selecting the Public-key/Secret-key pair

Alice chooses a finite field $\mathbb{F}_q$ with $q \geq 3$, and positive integers $m$ and $n$. She also
takes a vector $y$ of $\{T, F\}^n$ at random. This is her secret-key.

She then generates an instance $\mathcal{C} = \bigwedge_{i=1}^{m} C_i$ of doubly-balanced 3-SAT admitting
$y$ as model. For this, she uses a generation method due to E. Hirsch [Hi],
called hgen2. This method follows the one described in Section 2.1, but with some
additional constraints that aim to generate formulae whose clauses are as independent as
possible. For instance, if a clause involves literals $u_j$, $u_k$ and $u_\ell$, then his algorithm
is designed such that no other clause of $\mathcal{C}$ involves any two of them.

Having done this, Alice publishes the formula $\mathcal{C}$, together with $m$, $n$ and $q$ (values
$T$ and $F$ are also publicly known). In Section 5, we shall explain how we represent
$\mathcal{C}$. Indeed, as shown in Section 2.2, it would have been equivalent - from an
information theoretic viewpoint - to publish the $m$ polynomials corresponding to
these $m$ clauses, but the “clause-representation” allows for a more compact form.

Encryption 

The encryption phase follows the idea of a regular Polly Cracker scheme, but the
practical realization is quite different from [FK]. We shall denote by $\{p_1, \ldots, p_m\}$
the polynomials constructed from the clauses $\{C_1, \ldots, C_m\}$, and by $I$ the ideal
generated by these polynomials. We shall now explain how to use Proposition 2.2
to construct, in a very simple way, an element of $I$. In the next sections, we shall
motivate this construction. The algorithm is the following:



Algorithm 1

Input: $f \in \mathbb{F}_q[X]$, $l \geq 2$, $\{\lambda_1, \ldots, \lambda_l\}$, $\lambda_i \in \mathbb{F}_q$, with $\sum_{i=1}^{l} \lambda_i = 0 \ [q]$ and $\mathcal{D} =
\{D_1, \ldots, D_l\}$ a set of subset indexes such that $\forall\, 1 \leq i \leq l$, $\{Var(C_j)\}_{j \in D_i}$ is a
disjoint set.

Output: An element of the ideal $I$.

For $i$ from 1 to $l$ do

Compute $N_i(f)$, the normal form of $f$ modulo $\{p_j\}_{j \in D_i}$.

End For

Return $e_f = \sum_{i=1}^{l} \lambda_i N_i(f)$.



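Theorem 3.1 (Correctness). With the inputs given in the preceding algorithm, $e_f$
is an element of $I$.

Proof. Note that, according to Proposition 2.2, at each step $i$, $1 \leq i \leq l$, of the
algorithm, $\{p_j\}_{j \in D_i}$ is a Gröbner basis. Hence, $N_i(f)$ being the normal form of $f$
modulo $\{p_j\}_{j \in D_i}$, we have that $f_i = N_i(f) - f$ reduces to 0 modulo $\{p_j\}_{j \in D_i}$. Thus:

$$\forall\, 1 \leq i \leq l, \quad f_i \in \langle p_j \rangle_{j \in D_i} \subseteq \langle p_1, \ldots, p_m \rangle.$$

We conclude the proof by noticing that, due to the choice of $\{\lambda_1, \ldots, \lambda_l\}$:

$$e_f = \sum_{i=1}^{l} \lambda_i N_i(f) = \sum_{i=1}^{l} \lambda_i \big(N_i(f) - f\big) = \sum_{i=1}^{l} \lambda_i f_i \in \langle p_1, \ldots, p_m \rangle. \qquad \square$$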
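Algorithm 1 can be illustrated on a toy instance. The Python sketch below makes several assumptions that are not in the paper: the field is $\mathbb{F}_7$ with $T = 1$ and $F = 2$, polynomials are stored as dictionaries mapping exponent tuples to coefficients, and the four clauses and the secret `y` are invented for the example. The normal form routine relies on the clause polynomials being monic with pairwise coprime head terms, as in Proposition 2.2.

```python
Q = 7            # assumed prime field size
T, F = 1, 2      # assumed encodings of True / False

def clause_poly(clause, n, q=Q):
    """Expand a clause polynomial into {exponent tuple: coefficient}.

    clause is a list of (variable index, is_positive), variables 0..n-1.
    """
    poly = {(0,) * n: 1}
    for v, positive in clause:
        root = T if positive else F
        nxt = {}
        for exp, c in poly.items():
            up = exp[:v] + (exp[v] + 1,) + exp[v + 1:]
            nxt[up] = (nxt.get(up, 0) + c) % q
            nxt[exp] = (nxt.get(exp, 0) - root * c) % q
        poly = {e: c for e, c in nxt.items() if c}
    return poly

def normal_form(f, clauses, n, q=Q):
    """Normal form of f modulo the polynomials of pairwise disjoint clauses."""
    basis = []
    for clause in clauses:
        head = tuple(1 if any(v == i for v, _ in clause) else 0 for i in range(n))
        basis.append((head, clause_poly(clause, n, q)))
    todo, done = dict(f), {}
    while todo:
        term, coeff = todo.popitem()
        for head, g in basis:
            if all(a >= b for a, b in zip(term, head)):
                shift = tuple(a - b for a, b in zip(term, head))
                for e, gc in g.items():
                    if e != head:   # g is monic: its head term cancels `term`
                        key = tuple(a + b for a, b in zip(e, shift))
                        todo[key] = (todo.get(key, 0) - coeff * gc) % q
                break
        else:
            done[term] = (done.get(term, 0) + coeff) % q   # irreducible term
    return {t: c for t, c in done.items() if c}

def evaluate(poly, y, q=Q):
    total = 0
    for exp, c in poly.items():
        for yi, e in zip(y, exp):
            c = c * pow(yi, e, q) % q
        total = (total + c) % q
    return total

def algorithm1(f, lambdas, index_sets, clauses, n, q=Q):
    assert sum(lambdas) % q == 0, "the lambdas must sum to 0 in F_q"
    e_f = {}
    for lam, D in zip(lambdas, index_sets):
        for exp, c in normal_form(f, [clauses[j] for j in D], n, q).items():
            e_f[exp] = (e_f.get(exp, 0) + lam * c) % q
    return {t: c for t, c in e_f.items() if c}

n = 6
y = (T, F, T, T, F, F)                       # toy secret key (a model)
clauses = [
    [(0, True), (1, False), (2, True)],      # x0 or (not x1) or x2
    [(3, False), (4, True), (5, False)],
    [(0, True), (3, True), (4, False)],
    [(1, False), (2, True), (5, True)],
]
f = {(1,) * n: 1}                            # f = x0 x1 x2 x3 x4 x5
e_f = algorithm1(f, lambdas=[1, 6], index_sets=[[0, 1], [2, 3]],
                 clauses=clauses, n=n)
```

On this instance `e_f` is a non-zero polynomial that evaluates to 0 at `y`, which is what membership in $I$ requires of any model of the formula.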

For $e_f(X) = \sum_{\alpha \in \mathbb{N}^n} a_\alpha x^\alpha$ with almost all the $a_\alpha$ being zero, we define $supp(e_f)$
as the set $\{\alpha \in \mathbb{N}^n : a_\alpha \neq 0\}$.

To encrypt $M$, Bob chooses $\beta = (\beta_1, \ldots, \beta_n) \in supp(e_f)$, $\beta \neq (0, \ldots, 0)$, computes
the ciphertext defined by:

$$c(X) = e_f(X) + M x_1^{\beta_1} \cdots x_n^{\beta_n} = e_f(X) + M x^\beta \in \mathbb{F}_q[X]$$

and sends to Alice $(c(X), \beta)$.

Decryption 

Upon receiving $(c(X), \beta)$, Alice evaluates:

$$\frac{c(y)}{y^\beta} = \frac{e_f(y) + M y^\beta}{y^\beta} = M$$

and recovers$^9$ the plaintext.
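The encryption and decryption steps can be exercised on a toy example. In the sketch below, a single clause polynomial stands in for the output $e_f$ of Algorithm 1 (it is an element of the ideal, which is all that decryption uses); the field $\mathbb{F}_7$, the encodings $T = 1$, $F = 2$, the clause and the secret `y` are all assumptions made for illustration.

```python
Q = 7            # assumed prime field size (inverses via Fermat's little theorem)
T, F = 1, 2      # assumed non-zero encodings of True / False

def clause_poly(clause, n, q=Q):
    """Expand a clause polynomial into {exponent tuple: coefficient}."""
    poly = {(0,) * n: 1}
    for v, positive in clause:
        root = T if positive else F
        nxt = {}
        for exp, c in poly.items():
            up = exp[:v] + (exp[v] + 1,) + exp[v + 1:]
            nxt[up] = (nxt.get(up, 0) + c) % q
            nxt[exp] = (nxt.get(exp, 0) - root * c) % q
        poly = {e: c for e, c in nxt.items() if c}
    return poly

def evaluate(poly, y, q=Q):
    total = 0
    for exp, c in poly.items():
        for yi, e in zip(y, exp):
            c = c * pow(yi, e, q) % q
        total = (total + c) % q
    return total

def encrypt(e_f, beta, message, q=Q):
    """c = e_f + M x^beta, with beta a non-zero element of supp(e_f)."""
    assert any(beta) and e_f.get(beta, 0) != 0
    c = dict(e_f)
    c[beta] = (c[beta] + message) % q
    return c

def decrypt(c, beta, y, q=Q):
    """M = c(y) / y^beta; y^beta is non-zero because T and F are."""
    y_beta = evaluate({beta: 1}, y, q)
    return evaluate(c, y, q) * pow(y_beta, q - 2, q) % q

y = (T, F, T)                                        # toy secret key
e_f = clause_poly([(0, True), (1, False), (2, True)], n=3)   # stand-in for e_f
beta = (1, 1, 0)                                     # the term x0 * x1
recovered = [decrypt(encrypt(e_f, beta, M), beta, y) for M in range(Q)]
```

Each entry of `recovered` is the plaintext that was encrypted, because the ideal element vanishes at `y` and only the $M x^\beta$ part survives evaluation.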

4. Security Issues 

4.1. Total Break 

It is clear that the crucial point in using the 3-SAT problem in a Polly Cracker
system lies in the method chosen for generating hard satisfiable instances. While
it remains an open problem to generate hard solved instances [ILL], the doubly-balanced
3-SAT formulae are among the hardest 3-SAT instances to solve by currently
known methods: this is due to the fact that they are neither completely random,
as instances from the random 3-SAT problem can be, nor completely “structured”
(this terminology refers to 3-SAT instances arising from the modelling of real-life
phenomena occurring in, e.g., planning or scheduling). Thus, algorithms that are efficient
on random formulae, such as Unit Walk or OKsolver [Sa], will be defeated by the
regularity of those formulae, whereas algorithms that perform well on structured
instances - like Zchaff or Sato [Sa] - will behave poorly, those formulae being
too “random” to handle. The ones chosen by us for the construction of our
public-keys come from the hgen2 generator of E. Hirsch [Hi]. The formulae of this
family have been tested, in the SAT'2002 and SAT'2003 competitions, against
all the best solvers (see again [Sa]). The result is that formulae generated by this
algorithm have proven to be the ones that best resist known solvers. Besides,
instances from this generation method have won the smallest (in terms of $n$) satisfiable
unsolved instance challenge of this competition: the smallest such instance
had parameters $n = 500$ and $m = 1750$. These formulae, available in a benchmark
[Hi], still remain unsolved. For this system, we recommend $700 \leq n \leq 900$ and
$m = 3.5n$, which puts instances of these sizes far beyond the reach of the current
best solvers.

On the other hand, the security of our scheme relies on the difficulty of finding a
solution of a system of $m$ polynomial equations of maximal degree 3 in $n$ variables
over a finite field $\mathbb{F}_q$. In other words, if $I$ denotes the ideal generated by these
polynomials, the problem is to find an element of the variety $V_{\overline{\mathbb{F}}_q}(I)$. This problem
can be solved by computing a Gröbner basis of $I$. In this case, this gives
in fact all the elements of $V_{\overline{\mathbb{F}}_q}(I)$. The complexity of computing a Gröbner basis
of a system of polynomials - although theoretically doubly exponential in the
number of variables - depends in practice very much on the nature of the system,
and on the algorithm used. We have run the $F_4$ algorithm [Fa] on instances of our
scheme via the web interface$^{10}$ of Fgb. In practice, already for polynomial systems
corresponding to $n = 100$ variables and $m = 350$ clauses, such an algorithm fails
to compute a Gröbner basis: indeed, we have noticed that, after some iterations,
the algorithm cannot terminate, due to the handling of huge matrices (typically
square matrices of a hundred thousand entries). Thus, it appears that the sizes we
consider are far out of reach of this type of algorithm.

$^9$ $T$ and $F$ being two non-zero field elements, it follows that $y^\beta \neq 0$ for any choice of $\beta$.

Chosen ciphertext attack 

Recently, a chosen-ciphertext attack on Polly Cracker systems was designed [SG],
whereby it is possible to retrieve the secret key with $n$ queries to a decryption
oracle. It is well known that homomorphic cryptosystems are vulnerable to this
type of attack. For cryptosystems over the integers (e.g., RSA), padding schemes
like REACT [OP] address the problem. For polynomial-based cryptosystems, it
remains an open problem to adapt those paddings, especially how to represent
plaintexts in order to perform operations like hashing or “xoring” on them.

4.2. Single Break 

The second approach to cryptanalysis looks for weaknesses in Bob’s construction of 
the ciphertext rather than in Alice’s construction of the public-key. We recall that 
this attack, as opposed to the total break one, consists of recovering the cleartext 
from a particular ciphertext, but does not recover a secret key, thus in principle 
not compromising other uses of the system. 

Linear algebra attack 

The method is as follows. Since $e_f \in I$, there exist $\{h_i\}_{1 \leq i \leq m}$ in $\mathbb{F}_q[X]$, such that:

$$e_f = \sum_{i=1}^{m} h_i p_i.$$

We call these polynomials the decomposition of the polynomial $e_f$ under $I$. Moreover,
since:

$$c = e_f + M x^\beta,$$

we then have the following equation:

$$c = \sum_{i=1}^{m} h_i p_i, \quad \text{except for the term } x^\beta.$$

We can solve this equation by regarding the coefficients of the $h_i$'s as unknowns and get
linear equations by identifying the coefficients of the terms of $c$ with the coefficients
of the terms of $\sum_{i=1}^{m} h_i p_i$ (except for the term $x^\beta$). Due to the huge number of
unknowns in the linear system, this attack is in general intractable [Ko].
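To get a rough feeling for the size of this linear system, one can count candidate coefficients under a generic degree bound. The sketch below is a back-of-the-envelope estimate only: it assumes that each cofactor $h_i$ may contain every term of total degree at most $d$, a bound chosen here for illustration and not taken from the paper. It uses the standard fact that there are $\binom{n+d}{d}$ terms of total degree at most $d$ in $n$ variables.

```python
from math import comb

def terms_up_to_degree(n, d):
    """Number of terms of total degree at most d in n variables."""
    return comb(n + d, d)

def unknowns(n, m, d):
    """Unknown coefficients if each of the m cofactors h_i may contain
    every term of total degree at most d (an assumed, generic bound)."""
    return m * terms_up_to_degree(n, d)

small_case = terms_up_to_degree(3, 2)   # terms of degree <= 2 in x1, x2, x3
estimate = unknowns(n=100, m=350, d=3)
```

Even for modest $n$ and small $d$ the count grows quickly, which is consistent with the intractability remark above.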



$^{10}$ http://calfor.lip6.fr/jcf/Software/Fgb/index.html

4.2.1. Intelligent Linear Algebra Attack. In order to decrease the number of unknowns,
H.W. Lenstra Jr. [Ko] proposed an improvement of this method$^{11}$. Let:

$$H(c) = \Big\{ t \in Term : \exists\, t_p \in \bigcup_{i=1}^{m} Term(p_i),\ \exists\, t_c \in Term(c) \text{ such that } t_c = t\, t_p \Big\}.$$

Roughly speaking, $H(c)$ denotes the set of terms that Bob can potentially use to
construct the given ciphertext $c$. If:

$$\bigcup_{i=1}^{m} Term(h_i) \subseteq H(c) \qquad (C1),$$

i.e., for all $i$, every term of $h_i$ divides at least one term of the ciphertext $c$, then it
is possible to recover $e_f$ by solving a system of linear equations (constructed with
the method described above) involving only $\#H(c)$ unknowns.
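The set $H(c)$ is straightforward to compute once terms are represented as exponent tuples. The following sketch uses that representation (an assumption of this example) and a made-up pair of term sets in two variables; the helper names `divide` and `H` are chosen here.

```python
def divide(t_c, t_p):
    """Return t_c / t_p as an exponent tuple, or None if t_p does not divide t_c."""
    quotient = tuple(a - b for a, b in zip(t_c, t_p))
    return quotient if min(quotient) >= 0 else None

def H(cipher_terms, public_terms):
    """All terms t such that t * t_p = t_c for some public term t_p
    and some ciphertext term t_c."""
    result = set()
    for t_c in cipher_terms:
        for t_p in public_terms:
            t = divide(t_c, t_p)
            if t is not None:
                result.add(t)
    return result

# Toy data in two variables: terms of (x0 - a)(x1 - b) and of a ciphertext.
public_terms = {(1, 1), (1, 0), (0, 1), (0, 0)}
cipher_terms = {(2, 1), (0, 1)}
candidates = H(cipher_terms, public_terms)

# Condition (C1) for a hypothetical set of cofactor terms:
cofactor_terms = {(1, 0), (0, 0)}
c1_holds = cofactor_terms <= candidates
```

When `c1_holds` is true, the attacker only needs unknowns for the terms in `candidates`, which is the saving described in the text.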

In order to avoid this attack, Koblitz in [Ko] (Ch. 5) proposed a clever construction
of the ciphertext for which the condition C1 is not achieved, i.e., there exists at
least one term $t \in \bigcup_{i=1}^{m} Term(h_i)$ which does not divide any of the terms of the
ciphertext.

With carefully chosen parameters of the system, we now show that our construction
is resistant to this attack (Corollary 4.3). For this, we need a couple of intermediate
results:

Theorem 4.1. Let $\{p_1, \ldots, p_m\}$ be the polynomials of the public-key. We shall denote by:

$I \subseteq \{1, \ldots, m\}$, $\{p_j\}_{j \in I}$ a subset of the public-key polynomials corresponding to a
disjoint set of clauses,

$f = a x^\alpha$ with $(a, \alpha) \in \mathbb{F}_q^* \times \mathbb{N}^n$,

$N(f)$ the normal form of $f$ modulo $\{p_j\}_{j \in I}$ w.r.t. the degree lexicographic order,

$D$ the set of terms of the decomposition of $N(f) - f$ under $\langle p_j \rangle_{j \in I}$.

If $k = |I| > 3$ and if $x^\alpha$ is a multiple of $\prod_{i=1}^{n} x_i$ then there exists at least one term
$t \in D$ of total degree strictly larger than any term of $N(f)$.

Proof. We shall give a constructive proof of this theorem. First, we outline the
different steps realized during the reduction process; for a more detailed description
of this process, we refer to [BW].

$$N^{(1)}(f) = f - a_{(1)} t_{(1)} p_{(1)},$$

$$N^{(2)}(f) = N^{(1)}(f) - a_{(2)} t_{(2)} p_{(2)} = f - \sum_{\rho=1}^{2} a_{(\rho)} t_{(\rho)} p_{(\rho)},$$

$$N^{(l)}(f) = N^{(l-1)}(f) - a_{(l)} t_{(l)} p_{(l)} = f - \sum_{\rho=1}^{l} a_{(\rho)} t_{(\rho)} p_{(\rho)},$$

where $N^{(l)}(f)$ is the $l$-th reduction of $f$ modulo $\{p_j\}_{j \in I}$, and $p_{(l)}$ a polynomial of
$\{p_j\}_{j \in I}$ used at the $l$-th step of the reduction process. The term $t_{(l)}$ and the
constant $a_{(l)}$ are chosen in such a way as to eliminate from $N^{(l-1)}(f)$ a term $t$ that is a
multiple of the head term of $p_{(l)}$, and more precisely, we have $t = t_{(l)} HT(p_{(l)})$ and
$a_{(l)} = \frac{Coeff(t, N^{(l-1)}(f))}{Coeff(HT(p_{(l)}), p_{(l)})}$, with $Coeff(t, N^{(l-1)}(f))$ the coefficient of $t$ in $N^{(l-1)}(f)$. We
shall say that $t_{(l)}$ is the $l$-th term of the decomposition, chosen by the reduction
algorithm, of $N(f) - f$ under $\langle p_j \rangle_{j \in I}$. We define $\bar{l}$ as the minimum index for which
there does not exist a term in $N^{(\bar{l})}(f)$ divisible by one of the head terms of
$\{p_j\}_{j \in I}$, meaning that the reduction process ends after step $\bar{l}$ is performed. Hence:

$$N(f) = N^{(\bar{l})}(f) = f - \sum_{\rho=1}^{\bar{l}} a_{(\rho)} t_{(\rho)} p_{(\rho)}.$$

$^{11}$ In fact, we present here an adaptation of this attack to our scheme.

In the sequel, we suppose that the reduction process is performed with respect to 
the deglex order. 

We prove that at least $t_{(1)}$, the first term of the decomposition of $N(f) - f$ (under
$\langle p_j \rangle_{j \in I}$), cannot be recovered, with the intelligent linear algebra attack described
previously, from the terms of $N(f)$. We do so by showing that all the terms of $N(f)$ are of
total degree strictly smaller than the total degree of $t_{(1)}$. For this, we will show
that all the terms generated during the reduction process of $f$, of total degree
equal to or larger than the total degree of $t_{(1)}$, are cancelled. Remark that due to the
regular shape of the polynomials of the public-key and the particular form of $f$,
we can give the total degree of the terms occurring at each step of this process.

First step

At the first step, $t_{(1)}$ is chosen in order to remove multiples of $HT(p_{(1)})$ from the
term $x^\alpha$. If we denote by $d$ the total degree of $x^\alpha$, one sees at once that the total
degree of $t_{(1)}$ is equal to $d - 3$. Since $x^\alpha$ is divisible by the product of $n$ distinct
variables, $t_{(1)}$ is divisible by the product of at least $n - 3$ distinct variables. Let
$x_i$, $x_j$ and $x_k$ be variables such that $x_i x_j x_k t_{(1)} = x^\alpha$. We have:

$$Term(N^{(1)}(f)) = \{t_{(1)} x_i x_j,\ t_{(1)} x_i x_k,\ t_{(1)} x_j x_k,\ t_{(1)} x_i,\ t_{(1)} x_j,\ t_{(1)} x_k,\ t_{(1)}\}.$$

The terms of $N^{(1)}(f)$ of total degree $d - 1$ (resp. $d - 2$ and $d - 3$) are divisible by
the product of at least $n - 1$ (resp. $n - 2$ and $n - 3$) distinct variables. Moreover
$k > 3$ and the polynomials $\{p_j\}_{j \in I}$ are constructed from a disjoint set of clauses,
therefore all the terms of $Term(N^{(1)}(f))$ are divisible by at least one of the head
terms of $\{p_j\}_{j \in I}$. Since the reduction process is confluent, we can suppose without
loss of generality that the algorithm first eliminates all the terms of total degree
$d - 1$, then those of total degree $d - 2$ and finally the terms of total degree $d - 3$.

Total degree $d - 1$

In order to cancel the terms of $N^{(1)}(f)$ of total degree $d - 1$, the algorithm chooses
terms of total degree $d - 4$. Since all the terms of total degree $d - 1$ are divisible by
the product of at least $n - 1$ distinct variables, the terms chosen to cancel them are
divisible by the product of at least $n - 4$ distinct variables. The terms generated
during this step are of total degree $d - 2$ (resp. $d - 3$ and $d - 4$) and are divisible
by the product of at least $n - 2$ (resp. $n - 3$ and $n - 4$) distinct variables.

Total degree $d - 2$

This step is slightly different from the two steps above since the terms of total
degree $d - 2$ come from the elimination of the terms of total degree $d$ and $d - 1$.
But all the terms of total degree $d - 2$ computed during the previous steps are
divisible by the product of at least $n - 2$ distinct variables. Hence, all these terms
are eliminated and lead to the generation of terms of total degree $d - 3$ (resp. $d - 4$
and $d - 5$) which are divisible by the product of at least $n - 3$ (resp. $n - 4$ and
$n - 5$) distinct variables.

Total degree $d - 3$

The terms of total degree $d - 3$ come from the elimination of the terms of total
degree $d$, $d - 1$ and $d - 2$. We have shown that up to this point, all the terms of total
degree $d - 3$ generated during the previous steps are divisible by the product of at
least $n - 3$ distinct variables. Hence, all these terms are also eliminated. Remark
that after this step, no term of total degree $d - 3$ is generated by the algorithm.
Hence, in this setting, the terms of the polynomial $N(f)$ are of total degree strictly
smaller than $d - 3$. □
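The degree claim can be sanity-checked numerically on one small instance. The sketch below is only a check of a single example, under assumptions introduced here: the field is $\mathbb{F}_{13}$ with $T = 1$, $F = 2$, there are $n = 12$ variables and $k = 4$ invented disjoint clauses, and $f = x_0 x_1 \cdots x_{11}$, so that $d = 12$. It records every term $t_{(\rho)}$ used as a multiplier during the reduction and compares degrees.

```python
Q = 13           # assumed prime field size
T, F = 1, 2      # assumed encodings of True / False

def clause_poly(clause, n, q=Q):
    """Expand a clause polynomial into {exponent tuple: coefficient}."""
    poly = {(0,) * n: 1}
    for v, positive in clause:
        root = T if positive else F
        nxt = {}
        for exp, c in poly.items():
            up = exp[:v] + (exp[v] + 1,) + exp[v + 1:]
            nxt[up] = (nxt.get(up, 0) + c) % q
            nxt[exp] = (nxt.get(exp, 0) - root * c) % q
        poly = {e: c for e, c in nxt.items() if c}
    return poly

def reduce_and_track(f, clauses, n, q=Q):
    """Return (normal form of f, set of multiplier terms used on the way)."""
    basis = []
    for clause in clauses:
        head = tuple(1 if any(v == i for v, _ in clause) else 0 for i in range(n))
        basis.append((head, clause_poly(clause, n, q)))
    todo, done, multipliers = dict(f), {}, set()
    while todo:
        term, coeff = todo.popitem()
        if coeff == 0:
            continue
        for head, g in basis:
            if all(a >= b for a, b in zip(term, head)):
                shift = tuple(a - b for a, b in zip(term, head))
                multipliers.add(shift)          # a term of the decomposition
                for e, gc in g.items():
                    if e != head:               # monic: head cancels `term`
                        key = tuple(a + b for a, b in zip(e, shift))
                        todo[key] = (todo.get(key, 0) - coeff * gc) % q
                break
        else:
            done[term] = (done.get(term, 0) + coeff) % q
    return {t: c for t, c in done.items() if c}, multipliers

n, d = 12, 12
clauses = [                                    # four pairwise disjoint clauses
    [(0, True), (1, False), (2, True)],
    [(3, False), (4, True), (5, True)],
    [(6, True), (7, True), (8, False)],
    [(9, False), (10, False), (11, True)],
]
f = {(1,) * n: 1}                              # f = x0 x1 ... x11
nf, multipliers = reduce_and_track(f, clauses, n)
max_deg_nf = max(sum(t) for t in nf)
max_deg_multiplier = max(sum(t) for t in multipliers)
```

In this run the largest multiplier has total degree $d - 3$ while every term of the normal form has total degree below $d - 3$, in line with the statement of the theorem for this instance.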

More generally, we have: 

Corollary 4.2. Let $\{p_1, \ldots, p_m\}$ be the polynomials of the public-key, $I \subseteq \{1, \ldots, m\}$,
$\{p_j\}_{j \in I}$ be a subset of the public-key polynomials corresponding to a disjoint set of
clauses and $x^\alpha$ be a term of total degree $d$.

We define $f'(X) = a x^\alpha + g(X) \in \mathbb{F}_q[X]$ with $(a, \alpha) \in \mathbb{F}_q^* \times \mathbb{N}^n$ such that all the
terms of the polynomial $g \in \mathbb{F}_q[X]$ are of total degree strictly smaller than $d - 3$.
If $k = |I| > 3$ and if $x^\alpha$ is a multiple of $\prod_{i=1}^{n} x_i$ then there exists at least one
term $t$ in the set of terms of the decomposition of $N(f') - f'$ under $\langle p_j \rangle_{j \in I}$ of total
degree strictly larger than any term of $N(f')$.

Proof. The proof is similar to the one given above. Since the reduction process is
confluent, we can suppose without loss of generality that it begins by cancelling
$x^\alpha$ and, due to the particular choice of the terms of $g$, one sees at once that the
total degree of the first term of the decomposition of $N(f') - f'$ (under $\langle p_j \rangle_{j \in I}$)
is equal to $d - 3$. Moreover, all the terms generated during the reduction process
of $f'$, of total degree equal to or larger than $d - 3$, are cancelled. Hence, the terms of
$N(f')$ are of total degree strictly less than $d - 3$. □

This result is very interesting since in our context, the ciphertext is a linear 
combination of normal forms. Finally, we have the following security result: 

Corollary 4.3. Let $\{p_1, \ldots, p_m\}$ be the polynomials of the public-key and $x^\alpha$ be a
term of total degree $d$. We set $d < q$, we define $f'(X) = a x^\alpha + g(X) \in \mathbb{F}_q[X]$ with
$(a, \alpha) \in \mathbb{F}_q^* \times \mathbb{N}^n$ and such that all the terms of the polynomial $g \in \mathbb{F}_q[X]$ are of
total degree strictly smaller than $d - 3$. We also set:

- $l \geq 2$,

- $\{\lambda_1, \ldots, \lambda_l\}$, $\lambda_i \in \mathbb{F}_q$ such that $\sum_{i=1}^{l} \lambda_i = 0 \ [q]$ and,

- $\mathcal{D} = \{D_1, \ldots, D_l\}$ a set of subset indexes such that:

$\forall\, 1 \leq i \leq l$, $\{p_j\}_{j \in D_i}$ is constructed from a disjoint set of clauses,

$\forall\, 1 \leq i \leq l$, $D_i$ is a set of indexes of cardinality$^{12}$ $3 < |D_i| \leq \lfloor n/3 \rfloor$.

If $e_f$ is an element of $\langle p_1, \ldots, p_m \rangle$ computed by Algorithm 1 with these parameters,
then we have:

$$D \not\subseteq H(e_f),$$

$D$ being the set of terms of the decomposition of $e_f$ under $\langle p_i \rangle_{1 \leq i \leq m}$.

$^{12}$ $\lfloor n/3 \rfloor$ is the maximum cardinality of a disjoint set.



Proof. Recall that $e_f$ is a linear combination of normal forms. If we denote by
$\{h_j^{(i)}\}_{j \in D_i}$ the decomposition of $N_i(f) - f$ under $\langle p_j \rangle_{j \in D_i}$, we have:

$$e_f = \sum_{i=1}^{l} \lambda_i N_i(f) = - \sum_{i=1}^{l} \lambda_i \sum_{j \in D_i} h_j^{(i)} p_j.$$

Moreover, with the parameters given in Corollary 4.3 and according to Corollary
4.2, for all $i$, $1 \leq i \leq l$, all the terms of the normal forms $N_i(f')$ modulo $\{p_j\}_{j \in D_i}$
computed by Algorithm 1 are of total degree strictly smaller than $d - 3$. Hence, all
the terms of $e_f$ are of total degree strictly smaller than $d - 3$. Therefore, the terms
$t$ of total degree $d - 3$ of the polynomials $\{h_j^{(i)}\}$ cannot have a decomposition
of the form:

$$t_{e_f} = t' t, \quad \text{with } t_{e_f} \in Term(e_f) \text{ and } t' \in Term,$$

since the terms of $e_f$ are of total degree strictly smaller than $d - 3$.

Hence, we get that:

$$D \not\subseteq H(e_f).$$

Finally, all ciphertexts generated with such an element $e_f$ are resistant to the
intelligent linear algebra attack. □



4.2.2. Differential Attack. Hofheinz and Steinwandt propose in [HS] a method to
enhance the feasibility of the intelligent linear algebra attack previously described.
In particular, their attack makes it possible to recover “hidden monomials” in Koblitz's
graph perfect code instance of Polly Cracker [Ko] (Ch. 5). We detail here the ideas
of this attack.

For a polynomial $p = \sum_{\alpha} a_\alpha x^\alpha$, we denote by $|p|$ the number of monomials of $p$ and we also
set:

$$\Delta(p) = \left\{ \frac{a_\mu}{a_\nu} x^{\mu - \nu} : x^\mu \succ x^\nu,\ a_\mu \cdot a_\nu \neq 0 \right\},$$

$\succ$ denoting here the lexicographic order on the terms.

Suppose that for some $i$, $1 \leq i \leq m$, there exists a “characteristic difference” $\delta_i$,
i.e., $\delta_i = \frac{a_{\mu_i}}{a_{\nu_i}} x^{\mu_i - \nu_i}$, with $a_{\mu_i} x^{\mu_i}$, $a_{\nu_i} x^{\nu_i}$ monomials in $p_i$ and such that:

$$\delta_i \in \Delta(p_i) \setminus \Big( \bigcup_{j \neq i} \Delta(p_j) \Big).$$

Suppose in addition that there exists a monomial $a_m x^m$ in $h_i$ such that $x^m x^{\mu_i}$ and
$x^m x^{\nu_i}$ do not occur among the monomials of $c - a_m x^m p_i$. If for this “characteristic
difference”, an adversary can find monomials $m_1$, $m_2$ in the ciphertext with $x^{\mu_i} \mid m_1$
and $m_1/m_2$ being equal to $\delta_i$, then we can identify a potential monomial $m_h$ of $h_i$.

The adversary cannot be sure about the correctness of his guess (i.e., whether $m_h$ is really
a monomial of $h_i$). But he can check it by computing the number of monomials
in the simplified ciphertext $c' = c - m_h p_i$. Indeed, if the number of monomials in
$c'$ is smaller than in $c$, it is then very likely that $m_h$ is a monomial of $h_i$. Notice
that $c$ and $c'$ encrypt the same plaintext.

An adversary repeats this simplification process of the ciphertext for each “characteristic
difference” in the set $\Delta(p_i) \setminus \big( \bigcup_{j \neq i} \Delta(p_j) \big)$ and for all $i$, $1 \leq i \leq m$. If at
some point of this simplification process $c'$ is a monomial of the form $a_\beta x^\beta$, then
the encrypted plaintext has been recovered successfully. Otherwise, he can try to
perform an intelligent linear algebra attack on the simplified ciphertext.

Subtracting a polynomial from the ciphertext can reveal “hidden monomials”. Indeed,
the fact that a monomial $m_{h_j}$ in $h_j$ is hidden in the ciphertext $c$ implies that for
all $i$, $1 \leq i \leq m$, there exist two monomials $m_{h_i}$ in $h_i$ and $m_{p_i}$ in $p_i$ such that:

$$m_{h_j} m_{p_j} + \sum_{i=1}^{m} m_{h_i} m_{p_i} = 0.$$

Therefore, if one can find a monomial $m_{h_i} \in \{m_{h_1}, \ldots, m_{h_m}\}$, then we know that
the simplified polynomial $c' = c - m_{h_i} p_i$ contains a monomial of the form $m_{h_j} m_{p_j}$,
$m_{p_j}$ being a monomial of $p_j$. Therefore, the monomial which was hidden in the
ciphertext $c$ is no longer hidden in the simplified ciphertext $c'$.

We also would like to emphasize that it is not clear that the sets
$\big\{ \Delta(p_i) \setminus \big( \bigcup_{j \neq i} \Delta(p_j) \big) \big\}_{1 \leq i \leq m}$ always contain enough characteristic differences to recover all the
“hidden monomials”.

Following these remarks, we propose an improvement of the differential attack. In
particular, we no longer consider characteristic differences. Given a ciphertext $c$,
we first compute - for a monomial $m_i$ occurring in a decomposition of the form
$m_c = m_i m_{p_i}$, with $m_c$ being a monomial of $c$ and for some monomial $m_{p_i}$ of $p_i$ -
the polynomial $c' = c - m_i p_i$. This polynomial can validate the choice of the guess
(we do not know if $m_i$ is really a monomial of $h_i$). Indeed, if $|c'| = |c| - |m_i p_i|$,
then this can be taken as evidence that $m_i$ is a monomial of $h_i$. If this equality
on the number of monomials is not true, the polynomial $c'$ can also be useful
to reveal hidden monomials: if there exists a monomial $m_{c'}$ in $c'$ which is not a
monomial of $c$, and which occurs in a decomposition of the form $m_{c'} = m'_j m_{p_j}$,
for some monomial $m_{p_j}$ of $p_j$ (indeed, we then have $m_{c'} = m'_j m_{p_j} = m_i m_{p_i}$) then,
in addition to the fact that $m_i$ is probably a monomial of $h_i$, it is also very likely
that $m'_j$ was a monomial of $h_j$ that was hidden in the ciphertext $c$. In all other
cases, $m_i$ is not a monomial of $h_i$, and we then set $c' = c$.

At the second step, we select a monomial $m_k \neq m_i$ having a decomposition of the
form $m_{c'} = m_k m_{p_k}$, with $m_{c'}$ a monomial of $c'$ and for some monomial $m_{p_k}$ of $p_k$.
We compute $c'' = c' - m_k p_k$ and we verify as previously whether $m_k$ is a correct
guess. We iterate this process while the simplified ciphertext is not a monomial
of the form $a_\beta x^\beta$ (when it equals $a_\beta x^\beta$, then $a_\beta$ is the plaintext corresponding to
$c$, according to our encryption procedure). Notice that even if there are hidden
monomials in the ciphertext, it is very likely that these monomials can be guessed
by considering simplified ciphertexts.

As presented here, the attack of [HS] and the improvement we have described
above appear to be quite generic, and thus apply to our system too.



5. Practical Considerations 

The generation of the set $\mathcal{C}$ of clauses has been performed using the algorithm
hgen2. Apart from that, the complete implementation of instances of our system
has been done using the MAGMA symbolic language - which we found best suited
for the manipulation of multivariate polynomials (multiplication, evaluation on a
vector of $\mathbb{F}_q^n$, computation of the number of terms . . . ) - with interfaces in C.

The public-key consists of $m$ 3-clauses in the variables $x_i$, $1 \leq i \leq n$. It can thus
be stored using $3m \lg(n)$ bits, that is $O(n \lg(n))$ bits with $m = cn$.

The secret-key is $n$ bits long, as we can identify $T$ with 1 and $F$ with 0 for its
storage.

In practice, we choose a large $d$ (e.g., $d \approx 200$), and $q$ roughly of the same size as
$d$, with $q > d$.

Our construction of the ciphertext presents some practical advantages. First, it
allows one to construct a relatively short ciphertext (in comparison with a regular
Polly Cracker scheme) in a quite efficient way. We can control the size of the
ciphertext with the parameters $\{\lambda_1, \ldots, \lambda_l\}$ of Algorithm 1 (by setting $\lambda_i$ to zero
when having reached a certain size). Moreover, increasing the size of the public-key
does not increase the size of the ciphertext, and hence does not degrade the
performance of the system.

To have a more precise idea of the characteristics of this system, we give an example
of timings from an actual implementation. For $n = 700$ and $m = 2450$, we obtain: 4.3s to
generate a public-key of size 6.9 KBytes, an encryption time of 3.22s, 1527 terms
in the ciphertext and a decryption time of 0.13s.



6. Concluding Remarks 

We have presented a cryptographic scheme of Polly Cracker type, the underlying
problem of which is based on a subclass of the family of SATISFIABILITY problems.
We have proposed a specific method to construct the ciphertext. We have examined
its security on the one hand by considering single break attacks, and on the other
hand by exploring the best known methods to date to attack the hard problem.
Concerning single break attacks, the results obtained are quite interesting because
resistance to intelligent linear algebra attacks has always been a concern for Polly
Cracker type schemes. On the other hand, the differential attack and our extension
of it seem hard to defeat, as they are a generic tool to handle all Polly Cracker-like
ciphertext constructions. Finally, we believe that our approach - namely the
investigation of sharp methods from propositional logic and the setting of results
in a cryptographic context - is quite new.

Acknowledgments 

The authors would like to thank Laurent Simon, Olivier Bayeux and Yacine 
Boufkhad for very helpful discussions. 



References 

[BS] R.J. Bayardo Jr., R. Schrag. Using CSP look-back techniques to solve exceptionally
hard SAT instances. Proceedings of 2nd Int. Conference on Principles and
Practice of Constraint Programming, 1996, pp. 46-60.

[Ba] D. Bayer. The division algorithm and the Hilbert scheme. PhD Thesis, Harvard
University, Cambridge, Massachusetts, 1982.

[BW] T. Becker and V. Weispfenning. Gröbner Bases, A Computational Approach to
Commutative Algebra. In cooperation with Heinz Kredel. Graduate Texts in Mathematics,
141. Springer-Verlag, New York, 1993.

[CMo] S. Cocco, R. Monasson. Statistical physics analysis of the computational complexity
of solving random satisfiability problems using backtrack algorithms. The European
Physical Journal B 22, 2001, pp. 505-531.

[CMi] S.A. Cook, D.G. Mitchell. Finding hard instances of the satisfiability problem: a
survey. DIMACS Series in Discrete Mathematics and Theoretical Computer Science,
1997.

[DLL] M. Davis, G. Logemann, D. Loveland. A machine program for theorem proving.
Communications of the ACM, 5, 1962, pp. 394-397.

[DB] O. Dubois, Y. Boufkhad. From very hard doubly balanced SAT formulae to easy
unbalanced SAT formulae, variations of the satisfiability threshold. Proceedings
of the DIMACS workshop on the satisfiability problem: theory and applications,
March 1996.

[Fa] J.-C. Faugère. A new efficient algorithm for computing Gröbner basis: $F_4$. Journal
of Pure and Applied Algebra, vol. 139, 1999, pp. 61-68.

[FK] M. Fellows, N. Koblitz. Combinatorial cryptosystems galore! Proceedings of the
second international conference on “Finite Fields: theory, applications and algorithms”,
Las Vegas 1993, Contemporary Mathematics, vol. 168, 1994, pp. 51-61.

[GS] W. Geiselmann, R. Steinwandt. Some cracks in Polly Cracker. Europäisches Institut
für Systemsicherheit, Universität Karlsruhe, Tech. Report 01/01, 2001.

[Hi] E. Hirsch. http://logic.pdmi.ras.ru/~hirsch/

[HS] D. Hofheinz and R. Steinwandt. A “Differential” Attack on Polly Cracker. Proceedings
of 2002 IEEE International Symposium on Information Theory ISIT 2002,
extended abstract, p. 211, 2002.

[ILL] R. Impagliazzo, L. Levin, M. Luby. Pseudo-random number generation from one-way
functions. Proceedings of 21st STOC, 1989, pp. 12-24.

[Ko] N. Koblitz. Algebraic aspects of cryptography. Algorithms and Computation in
Mathematics, 3. Springer-Verlag 1998.

[Le] L. Van Ly. Polly Two - a public-key cryptosystem based on Polly Cracker. PhD thesis,
University of Bochum, Faculty of Mathematics, December 2002.

[Od] A. Odlyzko. The rise and fall of knapsack cryptosystems. Cryptology and computational
number theory, Proceedings of Symposium on Applied Mathematics 42,
AMS 1990, pp. 75-88.

[OP] T. Okamoto, D. Pointcheval. REACT: Rapid Enhanced-Security Asymmetric
Cryptosystem Transform. CT-RSA 2001, pp. 159-175.

[Sa] http://www.satlive.org/SATCompetition

[SKC] B. Selman, H. Kautz, B. Cohen. Noise strategies for improving local search. Proceedings
of AAAI-94, 1994, pp. 337-343.

[Sh] A. Shamir. A polynomial-time algorithm for breaking the basic Merkle-Hellman
cryptosystem. IEEE Transactions on Information Theory IT-30, 1984, pp. 699-704.

[SG] R. Steinwandt and W. Geiselmann. Cryptanalysis of Polly Cracker. IEEE Transactions
on Information Theory 48(11), 2002, pp. 2990-2991.



Françoise Levy-dit-Vehel and Ludovic Perret
ENSTA 

32 boulevard Victor 
F-75739 Paris cedex 15, France 
e-mail: levy@ensta.fr 
e-mail: lperret@ensta.fr 




Progress in Computer Science and Applied Logic, Vol. 23, 193-208 
© 2004 Birkhäuser Verlag Basel/Switzerland



Combinatorially Designed LDPC Codes Using 
Zech Logarithms and Congruential Sequences 

Jing Li 

Abstract. We investigate a systematic construction of regular low density
parity check (LDPC) codes based on $(\gamma\rho^{\gamma-1}, \rho^{\gamma}, \rho, \gamma, \{0,1\})$ combinatorial
designs. The proposed $(\gamma, \rho)$-regular LDPC ensemble has rate $(1 - 1/\rho)^{\gamma}$, girth
$\geq 2^{\gamma+1}$, and exists for all $\gamma \geq 2$. The codes are a subset of Gallager's random
ensemble, but contain a good combination of structure and (pseudo-)randomness.
In particular, the simple case of $\gamma = 2$ results in a class of codes that
are high-rate, systematic, quasi-cyclic, linear-time encodable and decodable,
and free of length-4 and length-6 cycles. Analysis of the distance spectrum shows
that they are better than the Gallager ensemble of the same parameters.
Simulation of the proposed codes on intersymbol interference channels shows
that they perform comparably to random LDPC codes. Unlike random codes,
the proposed structured LDPC codes lend themselves to a low-complexity
implementation for high-speed applications.

Keywords. Codes on graphs, Combinatorial design, Low density parity check 
(LDPC) codes, Inter-symbol interference (ISI). 



1. Introduction 

Considerable work has been done recently on the design and analysis of low density
parity check (LDPC) codes. The original LDPC codes proposed by Gallager
use random matrices [1]. Although research indicates that randomness is important
for capacity-approaching performance, codes with structure and regularity
are preferred for ease of implementation. In addition to random constructions
of LDPC codes, such as bit filling and/or optimization of degree profiles using density
evolution, systematic constructions are also being proposed, which include
approaches from combinatorial designs [4, 7, 5], finite geometries [6], congruential

This project was supported in part by a grant from Seagate Technology and a grant from the Com- 
monwealth of Pennsylvania, Department of Community and Economic Development, through the 
Pennsylvania Infrastructure Technology Alliance (PITA). 







sequences [8], Ramanujan graphs [9] and lattice designs [7]. It has been shown that
in some cases (especially for short lengths and/or high rates, like those used for
digital recording systems), structured LDPC codes are a better choice than random
LDPC codes, offering comparable performance, a smaller memory requirement, and
a more implementable structure [4, 6].

In general, an LDPC code is represented either by a parity check matrix H
or by its corresponding Tanner graph. The major parameters of an LDPC code include
the column weight $\gamma$ and the row weight $\rho$ of the H matrix, and the girth (the length
of the shortest cycle) of the Tanner graph. An LDPC code is said to be $(\gamma, \rho)$-regular
if all columns in H have weight $\gamma$ and all rows have weight $\rho$. The girth
is important to LDPC codes because the existing decoder is an iterative
message-passing decoder whose efficiency is sensitive to short cycles.

In this work, we investigate a class of structured LDPC codes from
$(\gamma\rho^{\gamma-1}, \rho^{\gamma}, \rho, \gamma, \{0,1\})$ combinatorial designs ($\gamma \geq 2$) [5]. In addition to their regular and thus
easily implementable structure, a key merit of considering combinatorial designs is
that length-4 cycles can be systematically avoided [2, 4, 7, 5]. In fact, for the specific
design proposed here, the girth is at least $2^{\gamma+1}$, i.e., length-6 cycles are also
systematically avoided. The proposed LDPC codes are a subset of Gallager's random
ensemble, but contain a good combination of structure and pseudo-randomness.
Zech logarithms in a Galois field (GF) and congruential sequences are used to
facilitate the implementation. In particular, the simple case of $\gamma = 2$ results in a class
of systematic, quasi-cyclic, high-rate $(2, \rho)$-regular codes, which are linear time
encodable and linear time decodable. Computation of the distance spectrum reveals
that they are (slightly) better than the Gallager ensemble of the same parameters.

In [11] it is shown that the thresholds of regular Gallager codes over the dicode
channel approach the i.i.d. capacity¹ of the channel at high rates. This means that
high-rate regular LDPC codes are asymptotically optimal for dicode channels. We
expect this to be true for a general intersymbol interference (ISI) channel as well. The
latter part of the paper investigates the application of the proposed LDPC codes
to partial response maximum likelihood (PRML) models that are used in digital
recording systems. We show that the proposed structured LDPC codes perform as
well as random LDPC codes, yet with a more implementable structure.



2. Preliminaries 

Definition 2.1. (Combinatorial Design) 

[1] A combinatorial design is an arrangement of a set of m points into n subsets, 
called blocks , which satisfy certain regularity constraints. 

[2] The incidence matrix of a combinatorial design is the (0,1)-matrix (of
dimensionality n x m) which has a row for each point v and a column for
each block B, and (v, B) = 1 iff point v is incident with block B.



¹ We use capacity to loosely denote the information rate.







[3] The covalency $\lambda_{v_1,v_2}$ of two points $v_1$ and $v_2$ is the number of blocks that
contain both of them.

[4] A design is said to be regular if the number of points contained in each block
(denoted as $\gamma$) is the same for every block and the number of blocks each
point is incident with (denoted as $\rho$) is the same for every point.

[5] A design is said to be balanced if the covalency $\lambda_{v_1,v_2}$ of the point pair $(v_1, v_2)$
is the same for all pairs. A regular and balanced design can be denoted as
an $(m, n, \rho, \gamma, \lambda)$-design, where $m\rho = n\gamma$.

It follows from the above definitions that a combinatorial design with favorable
constraints can define a binary LDPC code. The transpose of the incidence
matrix can be used as the parity check matrix H, where points and blocks in the
combinatorial design correspond to rows and columns in the H matrix. The H
matrix has m rows and n columns (codeword length), with row weight $\rho$ and column
weight $\gamma$. The code rate is given by $R = 1 - \mathrm{rank}(H)/n \geq 1 - m/n$ (the rows of the H matrix may
not all be independent). Further, covalency $\lambda < 2$ guarantees that the corresponding
Tanner graph is free of length-4 cycles.

An example is given in Fig. 1. Here m = 8 points are grouped into n = 16 blocks,
with each point incident with 4 blocks and each block containing 2 points, where
$B_1 = (v_1, v_2)$, $B_2 = (v_1, v_4)$, ..., $B_{16} = (v_7, v_8)$. Fig. 1(a) shows the combinatorial
design (where a line connecting 2 dots is used to denote a block containing two
points) and (b) the corresponding H matrix. The design has covalency $\lambda \in \{0, 1\}$
for all pairs of points and, hence, is free of length-4 cycles. In fact, the code shown
here is also free of length-6 cycles.
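The correspondence between a design and its parity check matrix can be checked mechanically. The following Python sketch rebuilds the Fig. 1 example under the assumption, read off Fig. 1(b), that the 16 blocks pair every odd-indexed point with every even-indexed point. All function names are illustrative and not part of the original construction.

```python
from itertools import combinations

def fig1_blocks(rho=4):
    """Blocks of the Fig. 1 design: each odd-indexed point paired with each even-indexed point."""
    odd = range(1, 2 * rho, 2)
    even = range(2, 2 * rho + 1, 2)
    return [(i, j) for i in odd for j in even]

def parity_check_matrix(m, blocks):
    """H has one row per point and one column per block."""
    H = [[0] * len(blocks) for _ in range(m)]
    for col, block in enumerate(blocks):
        for point in block:
            H[point - 1][col] = 1
    return H

def max_covalency(H):
    """Largest number of blocks shared by any pair of points (rows of H)."""
    return max(sum(a & b for a, b in zip(r1, r2)) for r1, r2 in combinations(H, 2))

blocks = fig1_blocks(4)
H = parity_check_matrix(8, blocks)
row_weights = {sum(row) for row in H}          # rho for a regular design
col_weights = {sum(col) for col in zip(*H)}    # gamma for a regular design
print(row_weights, col_weights, max_covalency(H))
```

A maximum covalency of 1 means that no two rows of H share more than one column, which is exactly the absence of length-4 cycles in the Tanner graph.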

Some popular classes of combinatorial designs that have already been studied
for generating LDPC codes are Steiner systems or $(m, n, \rho, \gamma, 1)$-designs [2],
Kirkman triple systems (KTS) or $(m, n, \rho, 3, \{0,1\})$-designs (which are resolvable
Steiner triple systems (STS)) [4, 7], and balanced incomplete block designs (BIBD)
[12, 13]. Other designs from lattices [7] and Ramanujan graphs [9] have also been
proposed. These systematically designed LDPC codes share the same desirable
properties, like simplicity in construction and regularity in code structure. Some of these
codes were shown to perform within 1 dB of the capacity on AWGN channels,
and others have been evaluated for use in magnetic recording channels. Below we
present a new design which results in a class of $(\gamma, \rho)$-regular LDPC codes of rate
$(1 - 1/\rho)^{\gamma}$.
3. $(\gamma\rho^{\gamma-1}, \rho^{\gamma}, \rho, \gamma, \{0,1\})$-Designed LDPC Ensemble

3.1. $(\gamma\rho^{\gamma-1}, \rho^{\gamma}, \rho, \gamma, \{0,1\})$-Design

Consider $\gamma\rho^{\gamma-1}$ points and $\rho^{\gamma}$ blocks. For ease of exposition, we label blocks
with $\gamma$-tuple subscripts, i.e., $B_{(x_1, x_2, \ldots, x_\gamma)}$, where $x_1, x_2, \ldots, x_\gamma \in \{0, 1, \ldots, \rho-1\}$.
Two blocks are said to be in the same plane if at least $(\gamma - 2)$ coefficients in the
$\gamma$-tuple subscript are the same. For each direction along the axis of $x_i$ (call it a
"principal" direction), $\rho^{\gamma-2}$ parallel planes (call them "principal" planes) can be
selected, each containing $\rho^2$ blocks (and collectively covering all blocks). In each
principal plane, the $\rho^2$ blocks can be evenly divided into $\rho$ discrete "bundles"
according to a predefined "bundle-rule" (which will be discussed later). Hence, there
are altogether $\gamma\rho^{\gamma-2}$ principal planes and $\gamma\rho^{\gamma-1}$ bundles, where no two bundles
contain the same pair of blocks. Each bundle then uniquely determines a point; in
other words, a point is incident only with blocks in the same bundle. We thus have
$\gamma\rho^{\gamma-1}$ well-defined points and $\rho^{\gamma}$ well-defined blocks, where each point is incident
with $\rho$ blocks (the blocks of its bundle) and each block with $\gamma$ points (one per
principal direction). Further, the overlap of any two blocks is
at most one point. This results in a $(\gamma\rho^{\gamma-1}, \rho^{\gamma}, \rho, \gamma, \{0,1\})$-design. Fig. 1 gives
an example of an $(8, 16, 4, 2, \{0,1\})$-design.

Figure 1, panel (b): the H matrix, with rows labelled by points and columns by blocks $B_1, \ldots, B_{16}$ (panels (a), (c) and (d) are drawings):

    v1   1 1 1 1  0 0 0 0  0 0 0 0  0 0 0 0
    v3   0 0 0 0  1 1 1 1  0 0 0 0  0 0 0 0
    v5   0 0 0 0  0 0 0 0  1 1 1 1  0 0 0 0
    v7   0 0 0 0  0 0 0 0  0 0 0 0  1 1 1 1
    v2   1 0 0 0  1 0 0 0  1 0 0 0  1 0 0 0
    v4   0 1 0 0  0 1 0 0  0 1 0 0  0 1 0 0
    v6   0 0 1 0  0 0 1 0  0 0 1 0  0 0 1 0
    v8   0 0 0 1  0 0 0 1  0 0 0 1  0 0 0 1

Figure 1. (a) $(m, n, \rho, 2, \{0,1\})$ combinatorial design, where $\rho = 4$,
$m = 2\rho = 8$, and $n = \rho^2 = 16$; (b) Resulting H matrix for an
LDPC code with code length n = 16; (c) A form of linear time
encodable LDPC codes, with accumulator $1/(1 \oplus D)$; (d) A form of turbo product codes.

Lemma 3.1. The above combinatorial construction results in $(\gamma\rho^{\gamma-1}, \rho^{\gamma}, \rho, \gamma, \{0,1\})$-designs
that have the following properties:

[1] The resulting $(\gamma, \rho)$-regular LDPC code has code length $n = \rho^{\gamma}$ and rate
$R = (1 - 1/\rho)^{\gamma}$.

[2] The girth of the corresponding Tanner graph is $\geq 2^{\gamma+1}$.

The proof of Lemma 3.1 is straightforward and, hence, is omitted. A comment is
that, since the code rate is $(1 - 1/\rho)^{\gamma}$, to get a reasonable rate (i.e., not too small),
either $\gamma$ is small or $\rho$ is large. Hence for practical applications, $\rho! > \gamma\rho^{\gamma-1}$ is almost
always satisfied, which makes a good construction possible and likely.
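As a quick numerical illustration of Lemma 3.1, the sketch below evaluates the design parameters for a few choices of $\gamma$ and $\rho$. The function name is hypothetical; the formulas are the ones stated in the lemma together with the generic bound $R \geq 1 - m/n$.

```python
def design_parameters(gamma, rho):
    """Return (m, n, rate, bound) for the design with parameters gamma and rho."""
    m = gamma * rho ** (gamma - 1)     # number of points = rows of H
    n = rho ** gamma                   # number of blocks = code length
    rate = (1 - 1 / rho) ** gamma      # rate as stated in Lemma 3.1
    bound = 1 - m / n                  # generic lower bound on the rate
    return m, n, rate, bound

for gamma, rho in [(2, 16), (2, 32), (3, 16)]:
    m, n, rate, bound = design_parameters(gamma, rho)
    print(f"gamma={gamma} rho={rho}: m={m} n={n} rate={rate:.4f} bound={bound:.4f}")
```

For $\gamma = 2$ the two parameter sets give (m, n) = (32, 256) and (64, 1024), with rates of about 0.88 and 0.94.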

Clearly, the performance of the above design under a message-passing decoder
is much affected by how blocks are bundled in a plane (i.e., how the incidence matrix is
defined). The desired bundle-rule contains enough structure to ease the construction
and implementation, and enough randomness to avoid recurrence of a bad
pattern (like short cycles) in the design. Below we discuss two effective ways of
using Zech logarithms and congruential sequences to construct bundle-rules. Other
approaches, like cyclic patterns from finite geometry, are also possible.



3.2. Zech Logarithm 

When $\rho = p^t$, where p is a prime number and t an integer, the bundle-rule can
be described using a pseudo-random permutation table generated systematically
from Zech logarithm arithmetic in the Galois field GF($p^t$). This is how it works.
Let $\alpha$ be a (predetermined) primitive element in GF($p^t$). The elements in GF($p^t$)
can be represented as $0, 1, \alpha, \alpha^2, \ldots, \alpha^{\rho-2}$, or equivalently $-\infty, 0, 1, \ldots, \rho-2$, where
$\log \alpha^k = k$ for $0 \leq k \leq \rho-2$ and $\log 0 = -\infty$. A permutation vector with seed $i_0$,
denoted as $\pi_{i_0}$, is constructed using the Zech logarithm as follows:

$$\pi_{i_0}(j) = \begin{cases} \log(\alpha^{i_0} + \alpha^{j}), & \text{for } j = 0, 1, 2, \ldots, \rho-2, \\ \log(\alpha^{i_0}), & \text{for } j = \rho-1, \end{cases} \tag{3.1}$$

where $i_0 \in \{-\infty, 0, 1, \ldots, \rho-2\}$. Each permutation vector $\pi_{i_0}$ uniquely specifies
a bundle in a plane, such that the $\rho$ blocks with subscripts $(j, \pi_{i_0}(j))$, where
$j = 0, 1, \ldots, \rho-1$ (for ease of exposition, we omit the irrelevant $\gamma - 2$ indexes
in the subscript), belong to the same bundle. Since different seeds $i_0$ result in
different permutation vectors, there are altogether $\rho$ permutation vectors which can
be used to bundle the $\rho^2$ blocks in a plane. We illustrate this through the following
example.



Example. (Permutation Table from Zech Logarithm in GF($2^3$))

Consider $\rho = 2^3$. We take a root of the minimal polynomial $P(x) = x^3 + x + 1$
in GF($2^3$) as the primitive element $\alpha$. Tab. 1 summarizes the Zech logarithm
arithmetic and the resulting permutation table ($-\infty$ is denoted and interpreted
as position $\rho - 1$ (= 7) in the table). Fig. 2 shows how bundles are defined by
the permutation vectors $\pi_0$, $\pi_2$ and $\pi_7$. We use boxes to denote blocks, and those
connected to the same line are considered in one bundle. Hence, a permutation
table uniquely defines $\rho$ bundles (corresponding to $\rho$ points) in a plane.
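The permutation table of this example can be regenerated with a few lines of Python. The sketch below is a minimal illustration, assuming field elements are stored as 3-bit integers (bit i is the coefficient of $x^i$) and that $-\infty$ is stored as position 7, as in Tab. 1; the helper names are not from the original text.

```python
def gf8_tables():
    """Antilog and log tables for GF(2^3) with alpha a root of x^3 + x + 1."""
    antilog = []                 # antilog[k] = alpha^k as a 3-bit integer
    x = 1
    for _ in range(7):
        antilog.append(x)
        x <<= 1                  # multiply by alpha
        if x & 0b1000:           # reduce using x^3 = x + 1
            x ^= 0b1011
    log = {value: k for k, value in enumerate(antilog)}
    return antilog, log

def zech_permutation_table():
    """Rows pi_{i0} of (3.1) for seeds i0 = 0..6 and -infinity (stored as 7)."""
    antilog, log = gf8_tables()
    rho, inf = 8, 7

    def element(i):
        return 0 if i == inf else antilog[i]

    def logarithm(value):
        return inf if value == 0 else log[value]

    table = []
    for i0 in range(rho):
        # Addition in GF(2^3) is bitwise XOR of the 3-bit representations.
        row = [logarithm(element(i0) ^ antilog[j]) for j in range(rho - 1)]
        row.append(logarithm(element(i0)))     # the j = rho - 1 case of (3.1)
        table.append(row)
    return table

table = zech_permutation_table()
for i0, row in enumerate(table):
    print(i0, row)
```

Each row is a permutation of 0, ..., 7 because adding a fixed field element is a bijection on GF($2^3$).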

Further, note that any two rows in a permutation table can be exchanged,
which results in a different permutation table (and thus a different bundle-rule).







Table 1. Permutation Table Constructed Using Zech Logarithm in GF($2^3$)

    Element    Binary   Log           Zech log: log(alpha^{i_0} + alpha^j)
    beta                i_0 = log beta   j = 0  1  2  3  4  5  6  7
    1          001      0                    7  3  6  1  5  4  2  0
    alpha      010      1                    3  7  4  0  2  6  5  1
    alpha^2    100      2                    6  4  7  5  1  3  0  2
    alpha^3    011      3                    1  0  5  7  6  2  4  3
    alpha^4    110      4                    5  2  1  6  7  0  3  4
    alpha^5    111      5                    4  6  3  2  0  7  1  5
    alpha^6    101      6                    2  5  0  4  3  1  7  6
    0          000      7                    0  1  2  3  4  5  6  7





[Figure 2: drawing of the permutation vectors $\pi_0, \pi_1, \ldots, \pi_7$ over block positions 0 to 7.]

Figure 2. Illustration of how a permutation table defines the bundle-rule



When $\rho! > \gamma\rho^{\gamma-1}$, i.e., the number of permutation tables is larger than the number
of principal planes, we have sufficient choices such that no two principal planes
use the same bundle rule. This brings a good amount of pseudo-randomness into
the construction to prevent short cycles. In the meantime, the basic permutation
table (i.e., Zech logarithm arithmetic) can be implemented using simple hardware
with small memory.

3.3. Congruential Sequences

Another efficient way to design pseudo-random permutation tables of dimension
$\rho \times \rho$ is to use maximal length congruential sequences (or M-sequences) of length
$M = \rho^2$.

If $M = 2^k$ for some integer k, a k-tap linear feedback shift register (LFSR)
can be used to generate an M-sequence. The characteristic polynomial of the LFSR
depicted in Fig. 3 is given by:

$$f(D) = 1 - \sum_{i=1}^{k} c_i D^i, \tag{3.2}$$

where the $c_i$ are connection variables, and D is a delay operator. When $f(D)$ is a
primitive polynomial (which exists for all $k > 1$), and when the initial values
$a_0, a_1, \ldots, a_{k-1}$ are not all zero, the corresponding binary LFSR sequence generated
through $a_n = \sum_{i=1}^{k} c_i a_{n-i}$ (where $\sum$ stands for binary summation) has
period $2^k - 1$ (see for example [17]). This is what is used in code division multiple
access (CDMA) systems to generate binary pseudo-random sequences (also
known as PN codes) of period $2^k - 1$. It can be conveniently shown that an integer
pseudo-random sequence of period $2^k - 1$, $\{A_n\}$, can be generated by combining k
consecutive terms in the binary LFSR sequence, namely,

$$A_n = [a_n, a_{n+1}, \ldots, a_{n+k-1}]_{\text{binary}} = \sum_{i=0}^{k-1} a_{n+i}\, 2^{i}. \tag{3.3}$$

Any $2^k - 1$ consecutive integers in the sequence lead to a length $2^k - 1$ (integer)
M-sequence. Notice that this sequence covers the numbers from 1 to $2^k - 1$ (without
the number 0). Hence, inserting a 0 at any position of this sequence leads to a
pseudo-random M-sequence of length $M = 2^k$.

Figure 3. System diagram of a k-tap linear feedback shift register
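The LFSR construction can be sketched in Python as follows. This is an illustration under stated assumptions: the feedback polynomial $f(D) = 1 + D + D^4$ (k = 4, so M = 16 and $\rho = 4$) is one primitive choice picked for the example, the bit $a_{n+i}$ is given weight $2^i$ when forming $A_n$, and the 0 is inserted at the front; none of these choices is prescribed by the text.

```python
def lfsr_bits(taps, state, count):
    """Binary LFSR a_n = sum_i c_i a_{n-i} mod 2; taps lists the indices i with c_i = 1."""
    a = list(state)                         # initial values a_0 .. a_{k-1}, not all zero
    while len(a) < count:
        a.append(sum(a[-i] for i in taps) % 2)
    return a[:count]

def integer_m_sequence(taps, state):
    """Integers A_n built from k consecutive LFSR bits, one full period."""
    k = len(state)
    period = 2 ** k - 1
    bits = lfsr_bits(taps, state, period + k - 1)
    # Assumed bit order: a_{n+i} carries weight 2^i.
    return [sum(bits[n + i] << i for i in range(k)) for n in range(period)]

# Assumed example: f(D) = 1 + D + D^4, i.e. a_n = a_{n-1} + a_{n-4} (mod 2).
seq = integer_m_sequence(taps=(1, 4), state=(1, 0, 0, 0))
full = [0] + seq                            # insert the missing 0 to reach length M = 2^k
print(seq)
print(len(full))
```

Because every non-zero k-bit window occurs exactly once per period of a maximal-length LFSR, `seq` contains each integer from 1 to 15 exactly once.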

A different approach that is just as simple but more general (i.e., applicable
to any non-zero M) is to use the algebraic formula [18]:

$$A_n = a A_{n-1} + b \bmod M. \tag{3.4}$$

To ensure that the resulting sequence is of maximal length, the parameters a and
b need to satisfy:

• a < M, b < M, and b is relatively prime to M;

• (a - 1) is a multiple of p, for every prime p dividing M;

• (a - 1) is a multiple of 4 if M is a multiple of 4;

• (optional) a is relatively prime to M.

The mapping of a length $M = \rho^2$ maximal congruential sequence to a $\rho \times \rho$
permutation table can be defined arbitrarily in principle, but it is desirable for the
mapping rule to contain both structure (for easy description and implementation)
and randomness (for good performance). For example, a simple way is to fill the
M-sequence into the permutation table and then sort and rank-order the elements in each row.










Example. (Permutation Table from Congruential Sequences)

For the case of $\rho = 6$, let us pick a = 13 and b = 5. The length M = 36 sequence
generated using (3.4) is as follows (starting with $A_0 = 0$):

00, 05, 34, 15, 20, 13, 30, 35, 28, 09, 14, 07, 24, 29, 22, 03, 08, 01,

18, 23, 16, 33, 02, 31, 12, 17, 10, 27, 32, 25, 06, 11, 04, 21, 26, 19.

There are many ways to fill the M-sequence into the table, like row-wise, column-wise,
diagonal-wise or any other cyclic pattern. Tab. 2(A) illustrates the zig-zag filling
pattern that starts from the top-left corner and proceeds diagonally from top-right
to bottom-left. Then, sorting and ordering each row, we obtain a permutation table
as shown in Tab. 2(B). By cyclically shifting any one row or several rows, a new
permutation table (and therefore a new bundle rule) will result.
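The example can be replayed with the sketch below, which generates the sequence from (3.4), fills the table along anti-diagonals, and then replaces each entry by its rank within its row. The rank interpretation of "sorting and ordering" is an assumption; it reproduces the rows of Tab. 2(B) except the fourth, for which this rule yields 1 4 2 5 3 0. Function names are illustrative.

```python
def lcg_sequence(a, b, M, start=0):
    """One full period of A_n = a * A_{n-1} + b mod M, beginning at A_0 = start."""
    seq, x = [], start
    for _ in range(M):
        seq.append(x)
        x = (a * x + b) % M
    return seq

def zigzag_fill(seq, rho):
    """Fill a rho x rho table along anti-diagonals, each from top-right to bottom-left."""
    table = [[None] * rho for _ in range(rho)]
    values = iter(seq)
    for d in range(2 * rho - 1):
        for r in range(max(0, d - rho + 1), min(d, rho - 1) + 1):
            table[r][d - r] = next(values)
    return table

def rank_rows(table):
    """Replace each entry by its rank within its row (a permutation of 0..rho-1)."""
    return [[sorted(row).index(v) for v in row] for row in table]

seq = lcg_sequence(a=13, b=5, M=36)
A = zigzag_fill(seq, 6)
B = rank_rows(A)
for row_a, row_b in zip(A, B):
    print(row_a, row_b)
```

Cyclically shifting a row of `B`, for instance `B[0][1:] + B[0][:1]`, gives a different permutation table in the sense described above.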

Table 2. Permutation Table Constructed Using Congruential Sequences

    (A) Zig-zag filling          (B) After sorting and ordering each row
    00  05  15  30  14  03       0  2  4  5  3  1
    34  20  35  07  08  33       4  2  5  0  1  3
    13  28  24  01  02  10       3  5  4  0  1  2
    09  29  18  31  27  06       1  5  3  2  4  0
    22  23  12  32  11  21       3  4  1  5  0  2
    16  17  25  04  26  19       1  2  4  0  5  3




3.4. The Simplest Case of $\gamma = 2$

In this subsection, we discuss a simple case, $\gamma = 2$, of the above design (since it is
easily analyzable) and compare it to the Gallager ensemble. First, we note that this
case (Fig. 1) is somewhat special in that the resulting LDPC ensemble contains
only one code for each given $\rho$ (if the relative order of the bits in the codeword
is ignored). Second, instead of using the aforementioned procedure and labelling
blocks with 2-tuples, all points and blocks can be labelled using a scalar and their
relations can be conveniently specified as follows: for a set of points containing an even
number of points, denoted as $V = \{v_1, v_2, \ldots, v_{2\rho-1}, v_{2\rho}\}$, a block B is composed
of two points from V, $v_i$ and $v_j$, where

$$i = j + k \bmod 2\rho, \quad \forall k = 1, 3, 5, \ldots, 2\lceil \rho/2 \rceil - 1. \tag{3.5}$$

The example of $\gamma = 2$, $\rho = 4$ is shown in Fig. 1. We have the following lemma for
this class of $(2, \rho)$-regular LDPC codes:

Lemma 3.2. The $(2, \rho)$-regular LDPC codes from the above design have the following
properties:

[1] They are a class of high-rate, systematic codes with code length $n = \rho^2$, rate
$R = (1 - 1/\rho)^2$ and girth 8.

[2] They are quasi-cyclic LDPC codes.

[3] They are linear time encodable and linear time decodable.
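The rate and girth claims of Lemma 3.2 can be checked numerically for small $\rho$. The sketch below builds the blocks from rule (3.5), reading the upper limit of k as $2\lceil \rho/2 \rceil - 1$ and indexing points from 0 instead of 1 (both are assumptions of this illustration), then computes the GF(2) rank of H and the girth of the Tanner graph by breadth-first search.

```python
from collections import deque

def blocks_from_rule(rho):
    """Unordered pairs {j, j + k mod 2*rho} for odd k = 1, 3, ..., 2*ceil(rho/2) - 1."""
    m = 2 * rho
    k_max = 2 * ((rho + 1) // 2) - 1
    pairs = {frozenset((j, (j + k) % m)) for j in range(m) for k in range(1, k_max + 1, 2)}
    return sorted(tuple(sorted(p)) for p in pairs)

def gf2_rank(rows):
    """Rank over GF(2) of rows given as integer bit masks."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot                     # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def tanner_girth(m, blocks):
    """Length of the shortest cycle; nodes 0..m-1 are points, m.. are blocks."""
    adj = {v: [] for v in range(m + len(blocks))}
    for c, block in enumerate(blocks):
        for p in block:
            adj[p].append(m + c)
            adj[m + c].append(p)
    best = None
    for s in adj:
        dist, parent, queue = {s: 0}, {s: None}, deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v], parent[v] = dist[u] + 1, u
                    queue.append(v)
                elif parent[u] != v:                 # a non-tree edge closes a cycle
                    cycle = dist[u] + dist[v] + 1
                    best = cycle if best is None else min(best, cycle)
    return best

rho = 4
blocks = blocks_from_rule(rho)
rows = [sum(1 << c for c, b in enumerate(blocks) if p in b) for p in range(2 * rho)]
rank = gf2_rank(rows)
rate = 1 - rank / len(blocks)
print(len(blocks), rank, rate, tanner_girth(2 * rho, blocks))
```

For $\rho = 4$ the rank is 7 rather than 8 (one check is redundant), which is why the rate $9/16 = (1 - 1/4)^2$ exceeds the bound $1 - m/n = 1/2$.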









The above properties can be conveniently verified. Here are a few comments.
First, we note that shifting a valid codeword leftward or rightward by $\rho$ bits
produces another valid codeword (quasi-cyclicity). However, the codewords are not
M-sequences, since the codeword length $\rho^2$ is a multiple of the period $\rho$. Second,
the encoder can be implemented with a linear shift register with feedback connections
based on its generator polynomial, which eliminates the necessity of storing
the generator matrix. Third, the linear time encodability (a property that is not
readily attainable for random LDPC codes [3]) can be inferred either from
quasi-cyclicity or from the following lemma:

Lemma 3.3. For an LDPC code specified by an m x n parity check matrix H, if
there are at least (m - 1) weight-2 columns which do not complete a cycle among
them, then encoding can be performed in time linear in n.

Proof. As shown in Fig. 1(c), we can rearrange these weight-2 columns to make
the corresponding matrix diagonal or sub-diagonal. Clearly, this realization can
be encoded in linear time using back substitution [16]. Furthermore, the parity check
matrix in Fig. 1(c) also presents a form of irregular repeat accumulate (IRA)
codes [14], where the left sub-diagonal part of the H matrix plays the role of an
accumulator $1/(1 \oplus D)$, and the right part functions to repeat data bits and form
checks among them. It is well known that IRA codes are linear time encodable.
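The back-substitution argument can be illustrated with a toy matrix of the form H = [A | P], where A is an m x m lower-bidiagonal block acting on the parity bits and P describes which data bits enter each check. The check sets below are hypothetical and are not taken from Fig. 1(c); the sketch only shows that each parity bit follows from the previous one in a single pass.

```python
def encode(data, check_sets):
    """Systematic encoding for H = [A | P] with A lower-bidiagonal.

    Row i of H reads p_i + p_{i-1} + sum(data[j] for j in check_sets[i]) = 0 (mod 2),
    with the p_{i-1} term absent for the first row.
    """
    parity, prev = [], 0
    for members in check_sets:
        bit = prev
        for j in members:
            bit ^= data[j]
        parity.append(bit)
        prev = bit                  # running XOR: the accumulator 1/(1 + D)
    return parity + list(data)

def syndrome(codeword, check_sets):
    """Evaluate every parity check of H = [A | P] on a codeword."""
    m = len(check_sets)
    parity, data = codeword[:m], codeword[m:]
    result = []
    for i, members in enumerate(check_sets):
        s = parity[i] ^ (parity[i - 1] if i else 0)
        for j in members:
            s ^= data[j]
        result.append(s)
    return result

check_sets = [(0, 1), (1, 2), (0, 3), (2, 3)]    # hypothetical data part P
codeword = encode([1, 0, 1, 1], check_sets)
print(codeword, syndrome(codeword, check_sets))
```

The work per parity bit is proportional to the weight of the corresponding row of P, so the total encoding cost is proportional to the number of ones in H.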

It is worth mentioning that the above $(2, \rho)$-regular LDPC codes can also be
viewed as a special type of 2-dimensional turbo product codes (TPC) constructed
from arrays of single-parity check (SPC) codes. Fig. 1(d) presents the same (2,4)-regular
LDPC code in an equivalent TPC/SPC format. We note, however, that
the general case of $(\gamma, \rho)$-regular codes ($\gamma \geq 3$) from the proposed design are not
TPC/SPC codes. The major differences are that 1) a TPC/SPC code is deterministic
and rigid in structure, whereas the proposed LDPC ensemble contains a
variety of realizations and pseudo-randomness for $\gamma \geq 3$; and 2) a $\gamma$-dimensional
TPC/SPC code of length n has girth $2^{\gamma+1}$ and contains approximately [...] cycles
of length $2^{\gamma+1}$ (for large $n \gg 2^{\gamma}$), whereas the proposed LDPC ensemble has girth
$\geq 2^{\gamma+1}$ (the worst-case construction has girth $2^{\gamma+1}$), and the number of length $2^{\gamma+1}$
cycles is small (due to pseudo-randomness in the bundle rule, we expect this number
to decrease as n increases).

3.5. Distance Spectrum Analysis 

In his original construction [1], Gallager specified an ensemble of $(\gamma, \rho)$-regular
LDPC codes whose m x n parity check matrix H can be horizontally split into
$\gamma$ sub-matrices of dimensionality $(m/\gamma) \times n$ each, where each sub-matrix has uniform
column weight 1 and row weight $\rho$ (denote such a sub-matrix as $H_{(1,\rho)}$). We refer
to this ensemble as the Gallager ensemble, since Gallager used it to derive
many useful results concerning the properties of LDPC codes and iterative
decoding. It can be seen from the construction procedure (as well as Fig. 1(b)) that
the proposed $(\gamma, \rho)$-regular LDPC ensemble is a subset of the Gallager ensemble.







Below we evaluate and compare the distance spectrum of the proposed subset with 
the whole set. 

The distance spectrum is useful in evaluating the ensemble average performance
(assuming an optimal decoder), but is generally hard to compute for an LDPC
ensemble. For the Gallager ensemble with random constructions, the expectation
(i.e., average) of the output weight enumerator function (OWEF) can be derived
fairly easily. For the structured $(\gamma, \rho)$-regular ensemble proposed above, a closed-form
expression for the OWEF involves tedious mathematics. Hence, we consider only
the simple case of $\gamma = 2$.



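Example. (Gallager Ensemble)

Consider the Gallager ensemble of $(\gamma, \rho)$-regular codes with code length n. The parity
check matrix $H_{(\gamma,\rho)}$ consists of $\gamma$ sub-matrices $H_{(1,\rho)}$, each of which has
output weight enumerator function [1]

$$A_{(1,\rho)}(w) = \underbrace{B(w) * B(w) * \cdots * B(w)}_{n/\rho}, \tag{3.6}$$

where $*$ denotes the convolution operation and

$$B(w) = \begin{cases} \binom{\rho}{w}, & w \text{ even}, \\ 0, & w \text{ odd}. \end{cases} \tag{3.7}$$

The average OWEF of the $(\gamma, \rho)$-regular Gallager ensemble is thus given by

$$\bar{A}_{(\gamma,\rho)}(w) = \frac{\left[A_{(1,\rho)}(w)\right]^{\gamma}}{\binom{n}{w}^{\gamma-1}}. \tag{3.8}$$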
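Equations (3.6) to (3.8) are straightforward to evaluate numerically. The sketch below assumes the standard reading of these formulas: each row of $H_{(1,\rho)}$ is a single-parity-check constraint on $\rho$ bits, $n/\rho$ such rows are convolved, and the ensemble average is $[A_{(1,\rho)}(w)]^{\gamma} / \binom{n}{w}^{\gamma-1}$. Function names are illustrative.

```python
from math import comb, log10

def spc_row_enumerator(rho):
    """B(w): number of even-weight words of length rho, zero for odd w."""
    return [comb(rho, w) if w % 2 == 0 else 0 for w in range(rho + 1)]

def convolve(p, q):
    """Discrete convolution of two coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def gallager_average(gamma, rho, n, w_max):
    """Average OWEF of the (gamma, rho)-regular Gallager ensemble for w = 0..w_max."""
    B = spc_row_enumerator(rho)
    A1 = [1]
    for _ in range(n // rho):
        # Truncation is safe: low-weight coefficients never depend on higher ones.
        A1 = convolve(A1, B)[: w_max + 1]
    return [A1[w] ** gamma / comb(n, w) ** (gamma - 1) for w in range(w_max + 1)]

avg = gallager_average(gamma=2, rho=16, n=256, w_max=8)
for w in range(2, 9, 2):
    print(w, round(log10(avg[w]), 4))
```

For the (2,16)-regular case with n = 256, the values at w = 2 and w = 4 agree with the Gallager column of Tab. 3 to the printed precision.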



Example. ($(2\rho, \rho^2, \rho, 2, \{0,1\})$-Designed LDPC Ensemble)

The codes resulting from this design are an alternative form of 2-dimensional
TPC/SPC codes. Hence, the exact OWEF (rather than the ensemble average)
can be computed using [19]

*m<»>=£E(!!)( E (3-9) 

a=0 ' ' \{3 even,P=0 / 

where 

= E, -»*(:) (':*)• (310 » 

In Tab. 3, we compare the output weight enumerators (in logarithm scale)
of the Gallager ensemble $(2, \rho)$-regular (random) LDPC codes and the proposed
$(2, \rho)$-regular (structured) LDPC codes. We observe that the proposed structured
codes are better than the ensemble average of random codes, with fewer codewords
at the low weight end of the distance spectrum. In other words, the proposed codes
are above average in the maximum likelihood sense.
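The low-weight entries of the structured code's spectrum can be cross-checked independently. The sketch below does not use the closed form of [19]; it is an alternative route via the MacWilliams identity, assuming the code is the product of two single-parity-check codes of length $\rho$. Under that assumption a dual codeword is the sum of a row checks and b column checks, it has weight $a\rho + b\rho - 2ab$, and each dual codeword arises from exactly two (row set, column set) choices.

```python
from math import comb, log10

def krawtchouk(n, w, d):
    """Coefficient of y^w in (1 + y)^(n - d) * (1 - y)^d."""
    return sum((-1) ** j * comb(d, j) * comb(n - d, w - j) for j in range(w + 1))

def product_spc_spectrum(rho, w):
    """Number of weight-w codewords of the 2-D SPC product code with n = rho^2."""
    n = rho * rho
    total = 0
    for a in range(rho + 1):            # number of row checks in the dual codeword
        for b in range(rho + 1):        # number of column checks in the dual codeword
            d = a * rho + b * rho - 2 * a * b
            total += comb(rho, a) * comb(rho, b) * krawtchouk(n, w, d)
    # The dual has 2^(2*rho - 1) codewords and each is counted twice above.
    return total // 2 ** (2 * rho)

for w in (2, 4, 6, 8):
    count = product_spc_spectrum(16, w)
    print(w, count, round(log10(count), 4) if count else "-")
```

For $\rho = 16$ this gives no codewords of weight 2 and $\binom{16}{2}^2 = 14400$ codewords of weight 4, consistent with the Proposed column of Tab. 3.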

It is worth noting that the number of weight-2 columns in an LDPC code
usually needs to be limited in order for the code to be asymptotically "good" on







Table 3. Comparing the Output Weight Enumerator of Gallager
Ensemble (Random) LDPC Codes and the Proposed (Structured)
Combinatorial Designed LDPC Codes ((2,16)-regular, n = 256,
Logarithm Scale)

    Output weight w    Gallager log10(A_w)    Proposed log10(A_w)
     2                  2.0529                 -
     4                  4.2471                 4.1584
     6                  6.4509                 6.2745
     8                  8.6383                 8.5254
    10                 10.7988                10.7074
    12                 12.9265                12.8570
    14                 15.0175                14.9651
    16                 17.0689                17.0300
    18                 19.0781                19.0500
    20                 21.0431                21.0232
    22                 22.9620                22.9483
    24                 24.8333                24.8242
    26                 26.6558                26.6499
    28                 28.4286                28.4249
    30                 30.1510                30.1488



AWGN channels ("good" in the sense defined by MacKay in [2]). For irregular
codes, the upper limit on the number of weight-2 columns is determined by the stability
condition. For regular codes, the column weight needs to be at least 3, since it
was shown by Gallager that only with $\gamma \geq 3$ will the average minimum distance
of a regular LDPC ensemble increase linearly with the code length [1]. Hence,
$(2, \rho)$-regular LDPC codes are not "good" codes on AWGN channels. However,
when $(2, \rho)$-regular codes are used with a modulation or channel that has memory,
the modulation/channel will provide another level of parity check (either binary
or nonbinary) to the coded bits from LDPC codes. In other words, $(2, \rho)$-regular
LDPC codes can be "good" codes in such cases. Further, recent work has shown
that regular LDPC codes are asymptotically optimal (i.e., reaching the i.i.d.
capacity of the channel) on a dicode channel [11]. We conjecture the result to be
valid on general ISI channels too.

4. Application on PRML Channels 

One possible application for the proposed structured LDPC codes, especially the
$(2, \rho)$-regular codes that are both simple and high-rate, is in digital recording
systems. This section evaluates their performance on ideal PR magnetic recording
channels.








Figure 4. System model for LDPC-coded PRML channels. 



Typical PR channel models used in magnetic recording systems include the PR-IV
channel family, whose channel response takes the form $H(D) = (1 - D)(1 + D)^q$,
where q = 1 is the PR4 channel, q = 2 is the EPR4 channel, and q = 3 is the E²PR4 channel. A block
diagram of the LDPC-coded PRML channel and a matching decoder is shown in
Fig. 4. We consider a soft-in soft-out iterative decoding and equalization (IDE, also
known as turbo equalization) receiver which is composed of an inner BCJR decoder
matched to the PR channel and an outer message-passing decoder matched to the
LDPC code. Further, a random interleaver is inserted between the LDPC code
and the PR channel to break up the correlation among code bits and to bring about
possible interleaving gain (the interleaver size is an integer multiple of the outer
LDPC code length).
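The channel targets are easy to tabulate. The sketch below assumes the standard convention $H(D) = (1 - D)(1 + D)^q$, under which q = 1, 2, 3 correspond to PR4, EPR4 and E²PR4, and a bipolar mapping of bits (0 to -1, 1 to +1) for the noiseless output; the helper names and the mapping are assumptions of this illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials in D given as coefficient lists, lowest power first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pr_target(q):
    """Coefficients of (1 - D)(1 + D)^q, lowest power of D first."""
    h = [1, -1]
    for _ in range(q):
        h = poly_mul(h, [1, 1])
    return h

def channel_output(bits, h):
    """Noiseless PR channel output for bipolar-mapped input bits (0 -> -1, 1 -> +1)."""
    x = [2 * b - 1 for b in bits]
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]

for name, q in [("PR4", 1), ("EPR4", 2), ("E2PR4", 3)]:
    print(name, pr_target(q))
print(channel_output([1, 1, 0, 1], pr_target(1)))
```

The growing number of taps with q is what makes the inner BCJR detector's trellis larger for EPR4 and E²PR4 than for PR4.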

The PR channel in a magnetic recording system is usually binary precoded. The
traditional role of the precoder is to limit error propagation in the threshold detectors for ISI
channels, but it has recently acquired another important role of improving the distance
spectrum in an iterative process. To facilitate the choice of a good precoder,
the i.i.d. capacity is computed using density evolution with Gaussian approximation.
The i.i.d. capacity of the system is computed as the maximum mutual
information between input and output LLRs of the system



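$$I = \frac{1}{2} \sum_{d=\pm 1} \int_{-\infty}^{\infty} f_d^{(code)}(l) \log_2 \frac{2 f_d^{(code)}(l)}{f_{+1}^{(code)}(l) + f_{-1}^{(code)}(l)}\, dl, \tag{4.1}$$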

where $d = \pm 1$, and $f_d^{(ch)}(l)$ and $f_d^{(code)}(l)$ are the pdf's of the input LLRs (from the
channel) to the LDPC-coded PR system and the output LLRs from the system
after joint decoding/detection, respectively.

The i.i.d. capacity of the proposed $(2, \rho)$-LDPC codes on EPR4 systems is
plotted in Fig. 5, where several binary precoders are evaluated. We see from the
plot that $1/(1 \oplus D \oplus D^2)$ and $1/(1 \oplus D^2 \oplus D^3)$ are apparently worse precoders than
the other two. Whereas $1/(1 \oplus D)$ and $1/(1 \oplus D^2)$ present almost identical i.i.d.
capacities, simulation results with finite lengths show that $1/(1 \oplus D^2)$ seems to yield
slightly better performance and, hence, it will be used throughout the simulations.














[Figure 5: plot titled "i.i.d. capacity for $(2, \rho)$-regular LDPC codes on EPR4 channels".]

Figure 5. I.i.d. capacity of $(2\rho, \rho^2, \rho, 2, \{0,1\})$-designed LDPC
codes on PRML channels with different precoding.

The performance of PRML magnetic recording channels employing the proposed
regular LDPC codes from the $(2\rho, \rho^2, \rho, 2, \{0,1\})$-design is evaluated via computer
simulations. We consider 2 basic code rates of R = 0.88 and 0.94, 3 interleaver
sizes of 1024, 2048 and 4096 bits, and 3 typical channel models, namely PR4, EPR4 and
E²PR4. Fig. 6 plots the bit error rate (BER) curves of a rate 0.88
code from the $(32, 256, 16, 2, \{0,1\})$-design on PR4, EPR4, and E²PR4 channels with
a precoder $1/(1 \oplus D^2)$. Although not shown, for an uncoded PRML system to reach
a BER of $10^{-5}$, 10.25, 10.5 and 10.8 dB are required for PR4, EPR4, and E²PR4
channels, respectively. Hence, we see that gains of 4 to 5 dB are achievable by the proposed
codes. The interleaving gain phenomenon is also observed in the plot, where
the increase of the interleaver length from 1K to 4K brings an additional 0.5 dB
gain.

Fig. 7 compares the BER performance of structured LDPC codes (from the
$(32, 256, 16, 2, \{0, 1\})$-design and the $(64, 1024, 32, 2, \{0, 1\})$-design) with that of
random $(3,p)$-regular LDPC codes. Computation of the i.i.d. capacity reveals that the
structured LDPC codes used here perform best with a precoder $1/(1 \oplus D^2)$ and
that the random LDPC codes perform best without a precoder. The BER curves
of the best cases for both codes are plotted, which shows that they are comparable
in performance; however, the proposed structured LDPC codes are simpler in
structure.

5. Conclusion 

We propose and discuss in this work a systematic construction of regular LDPC
codes from $(\gamma p^{\gamma-1}, p^{\gamma}, p, \gamma, \{0, 1\})$ combinatorial designs. The resulting LDPC codes
contain a good combination of pseudo-randomness and structure. Investigation of
their performance on PR magnetic recording channels shows that they perform
as well as or slightly better than the average random LDPC codes, yet their
well-defined structure allows them to be implemented at a much lower cost than random
codes. This is also in agreement with the result from [11] that regular LDPC codes
are asymptotically optimal on ISI channels.

Figure 6. BER performance of rate 0.88 LDPC codes from combinatorial
designs. (Plot title: R=0.88, LDPC from combinatorial design; axes: BER
versus Eb/No (dB).)

Figure 7. Comparison of the proposed structured LDPC codes
and random LDPC codes. (Plot title: Structured and random LDPC codes;
axes: BER versus Eb/No (dB).)





References 

[1] R.G. Gallager, Low-density parity-check codes MIT press, Cambridge, MA, 1963. 

[2] D. J. MacKay and M.C. Davey, Evaluation of Gallager codes for short block length and 
high rate applications Proc. of the IMA Workshop on Codes, System and Graphical 
Models, (1999). 

[3] T. Richardson, and R. Urbanke, Efficient encoding of low-density parity-check codes 
IEEE Trans. Inform. Theory, Feb. 2001. 

[4] S.J. Johnson, and S.R. Weller, Construction of low-density parity-check codes from 
Kirkman Triple Systems Proc GLOBECOM, San Antonio, Nov. 2001, 770-974. 

[5] J. Li and E. Kurtas, A class of $(\gamma p^{\gamma-1}, p^{\gamma}, p, \gamma, \{0, 1\})$ combinatorially designed
LDPC codes with applications to ISI channels, Proc. IEEE Intl. Symp. Inform.
Theory, Yokohama, Japan, June 2003, 29-29.

[6] Y. Kou, S. Lin, and M.P.C. Fossorier, Low-density parity-check codes based on finite 
geometries: a rediscovery and new results , IEEE Trans. Inform. Theory, Vol 47, Nov. 
2001. 2711-2736. 

[7] E. Kurtas, B. Vasic, and A.V. Kuznetsov, Design and analysis of low density parity 
check codes for applications to perpendicular recording channels The Wiley Encyclo- 
pedia of Telecom., (invited chapter), 2002. 

[8] A. Prabhakar, and K.R. Narayanan, Pseudo-random construction of low density par- 
ity check codes using linear congruential sequences IEEE Trans. Commun., vol. 50, 
Sept. 2002, 1389-1396. 

[9] I.J. Rosenthal, and P. Vontobel, Construction of LDPC codes using Ramanujan
graphs and ideas from Margulis Proc. Intl. Symp. on Inform. Theory, 2001.

[10] R.M. Tanner, D. Sridhara, and T. Fuja, A class of group-structured LDPC codes
Proc. Intl. Conf. on Inform. Tech. and Applications, Ambleside, England.

[11] A. Kavcic, B. Marcus, M. Mitzenmacher, and B. Wilson, Deriving performance
bounds for ISI channels using Gallager codes Proc. Intl. Symp. Inform. Theory,
June 2001, 345-345.

[12] R.C. Bose, On the construction of balanced incomplete block designs Ann. Eugenics 
9, 1939, 353-399. 

[13] B. Ammar, B. Honary, Y. Kou, and S. Lin, Construction of low density parity check
codes Intl. Symp. Inform. Theory, Switzerland, June 2002, 311-311.

[14] H. Jin, A. Khandekar and R. McEliece, Irregular repeat-accumulate codes 2nd Intl.
Symp. on Turbo Codes and Related Topics, Brest, France, Sept 2000.

[15] J. Li, K.R. Narayanan, E. Kurtas, and C.N. Georghiades, On the performance of 
high-rate TPC/SPC codes and LDPC codes over partial response channels IEEE 
Trans. Commun., May 2002, vol. 50, 723-734. 

[16] L. Ping, W.K. Leung, and N. Phamdo, Low density parity check codes with semi- 
random parity check matrix Electronics Letters, vol. 35, no. 1, Jan. 1999, 38-39. 

[17] Andrew Viterbi, CDMA: Principles of Spread Spectrum Communication , Prentice 
Hall, 1995. 







[18] G.C. Clark, Jr. and J.B. Cain, Error-correction coding for digital communications,
Plenum Press, NY, 1981.

[19] G. Caire, and C. Taricco, Weight distribution and performance of the iterated product
of single-parity-check codes, Proc. GLOBECOM Conf., 1994, 206-211.



Jing Li 

Department of Electrical and Computer Engineering 
Lehigh University 
19 Memorial Dr. W. 

Bethlehem, PA 18015, USA 
e-mail: JingLi@ece.lehigh.edu 




Progress in Computer Science and Applied Logic, Vol. 23, 209-222 
© 2004 Birkhauser Verlag Basel/Switzerland 



New Constructions of Constant-Weight Codes

Lei Li and Shoulun Long 



Abstract. By generalizing a propagation rule for binary constant-weight
codes, we present three constructions of binary constant-weight codes. It turns
out that our constructions produce binary constant-weight codes with good
parameters.

Mathematics Subject Classification (2000). Primary 94B60; Secondary 94B65. 
Keywords. Constant-weight codes, sets, linear spaces, free modules. 



1. Introduction 

Constant-weight codes have a very long history because of both practical appli- 
cations and theoretical interests. Various methods from algebra, finite geometry, 
combinatorics, etc., have been employed to construct good codes. The reader may 
refer to [6] and [2] for a good survey on this topic. 

In this paper, we first give a simple propagation rule by identifying a binary
constant-weight code with a family of subsets. This idea is further generalized
to linear spaces and free modules to construct binary constant-weight codes with
reasonable parameters.



2. Preliminaries 

In this section, we introduce some concepts and definitions that will be used in 
the next sections. 

2.1. Constant-Weight Codes

A binary constant-weight code $C \subseteq \mathbb{F}_2^n$ is a set of codewords that have the same
(Hamming) weight. $C$ is called an $(n, M, d; w)$ constant-weight code if $C$ is a set of
cardinality $M$ such that each codeword has the same weight $w$ and the distance
between any two codewords is at least $d$. Given $n$, $d$ and $w$, determining the
maximum possible size $A(n, d, w)$ of an $(n, M, d; w)$ binary constant-weight code
is an important problem in coding theory.







In calculating the distance between two codewords we have a useful formula:

Proposition 2.1 ([8], Lemma 4.3.4). For any two codewords $x = (x_1, x_2, \ldots, x_n)$
and $y = (y_1, y_2, \ldots, y_n) \in \mathbb{F}_2^n$, put $x * y = (x_1y_1, x_2y_2, \ldots, x_ny_n)$, then

\[ d(x, y) = wt(x + y) = wt(x) + wt(y) - 2\,wt(x * y). \qquad (2.1) \]
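The identity (2.1) can be checked exhaustively for short lengths. The sketch below does this for all pairs of binary words of a given length; the helper names are illustrative and not part of the paper.

```python
from itertools import product

def weight(x):
    """Hamming weight of a binary tuple."""
    return sum(x)

def star(x, y):
    """Componentwise product x * y."""
    return tuple(a * b for a, b in zip(x, y))

def add_mod2(x, y):
    """Componentwise sum over F_2."""
    return tuple((a + b) % 2 for a, b in zip(x, y))

def distance(x, y):
    """Hamming distance."""
    return sum(a != b for a, b in zip(x, y))

def identity_holds(n):
    """Check d(x,y) = wt(x+y) = wt(x) + wt(y) - 2 wt(x*y) for all x, y in F_2^n."""
    words = list(product((0, 1), repeat=n))
    return all(
        distance(x, y) == weight(add_mod2(x, y))
        == weight(x) + weight(y) - 2 * weight(star(x, y))
        for x in words for y in words
    )
```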

2.2. Gaussian Coefficients 

Given a prime power $q$ and two positive integers $k$, $r$ with $k \le r$, the number

\[ {r \brack k}_q = \frac{\prod_{i=r-k+1}^{r} (q^i - 1)}{\prod_{i=1}^{k} (q^i - 1)} \]

is called a Gaussian coefficient. For the convenience of later usage, we define ${r \brack k}_q = 0$
if $k > r$. The significance of Gaussian coefficients is described in the following
proposition.

Proposition 2.2 ([8], Theorem 5.1.12). Let $\mathbb{F}_q$ be a finite field and $V$ a linear space
of dimension $r$ over $\mathbb{F}_q$. Then the number of dimension $k$ $(\le r)$ subspaces of $V$ is

\[ {r \brack k}_q = \frac{(q^r - 1)(q^r - q) \cdots (q^r - q^{k-1})}{(q^k - 1)(q^k - q) \cdots (q^k - q^{k-1})}. \]
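A small sketch of the Gaussian coefficient, together with a brute-force count of subspaces over $\mathbb{F}_2$ that can be compared with Proposition 2.2. Vectors of $\mathbb{F}_2^r$ are encoded as integers and added with XOR; this encoding and the function names are assumptions made for the illustration.

```python
from itertools import combinations

def gaussian_binomial(r, k, q):
    """Gaussian coefficient [r, k]_q, defined as 0 when k > r."""
    if k > r:
        return 0
    num = den = 1
    for i in range(1, k + 1):
        num *= q ** (r - k + i) - 1
        den *= q ** i - 1
    return num // den

def span_f2(vectors):
    """All F_2-linear combinations of the given integer-encoded vectors."""
    span = {0}
    for v in vectors:
        span |= {v ^ u for u in span}
    return frozenset(span)

def count_subspaces_f2(r, k):
    """Count k-dimensional subspaces of F_2^r by enumerating spans of k vectors."""
    nonzero = range(1, 2 ** r)
    spans = {span_f2(c) for c in combinations(nonzero, k)}
    return sum(1 for s in spans if len(s) == 2 ** k)
```

For example, both `gaussian_binomial(4, 2, 2)` and `count_subspaces_f2(4, 2)` count the 2-dimensional subspaces of $\mathbb{F}_2^4$.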

2.3. Linearized Polynomials and Rank Distance Codes 

We first review rank distance codes studied by Gabidulin in [3]. Let $\mathcal{A} = \{A_i\}$ be
a set of $t \times m$ matrices over a finite field $\mathbb{F}_q$. The distance $d(A, B)$ between two
matrices $A$ and $B$ in $\mathcal{A}$ is defined by $d(A, B) = \mathrm{rank}(A - B)$, and the minimum
distance of $\mathcal{A}$, denoted by $d(\mathcal{A})$, is defined as $d(\mathcal{A}) = \min\{d(A, B) : A \ne B \in \mathcal{A}\}$.
Let $d = d(\mathcal{A})$ and $M = |\mathcal{A}|$. We call $\mathcal{A}$ a $(t \times m, M, d)$ rank distance code. For a
$(t \times m, M, d)$ rank distance code $\mathcal{A}$, the Singleton bound is valid, i.e.,

\[ d(\mathcal{A}) \le t - l + 1, \qquad (2.2) \]

where $l = \log_{q^m} M$. Codes for which equality holds in (2.2) are referred to as
MRD-codes (Maximum-Rank-Distance codes).

In [5], Johansson presents a method for constructing MRD-codes from linearized
polynomials. Let $1 \le l \le t \le m$ be positive integers. A polynomial of the
form

\[ F(x) = \sum_{i=0}^{t} f_i x^{q^i}, \]

where $f_i \in \mathbb{F}_{q^m}$, is called a linearized polynomial. Denote all linearized polynomials
of degree not higher than $q^{l-1}$ as

\[ P_{l,t,m} = \Big\{ F(x) = \sum_{i=0}^{t} f_i x^{q^i} : f_i \in \mathbb{F}_{q^m},\ \deg(F(x)) \le q^{l-1} \Big\}. \]

Assume $g_1, g_2, \ldots, g_t$ are specified elements in the field $\mathbb{F}_{q^m}$ which are linearly
independent over $\mathbb{F}_q$, and for each $F(x) \in P_{l,t,m}$, put

\[ A_F = (F(g_1), F(g_2), \ldots, F(g_t))^T. \]







Fix a basis of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$, and write each $F(g_i) = (a_{i1}, a_{i2}, \ldots, a_{im})$, expressed in
this fixed basis, as a row vector, where each entry $a_{ij} \in \mathbb{F}_q$. Therefore, each $A_F$
can be viewed as a $t \times m$ matrix $(a_{ij})$ over $\mathbb{F}_q$, and $\mathcal{A} = \{A_F : F(x) \in P_{l,t,m}\}$
can be viewed as a rank distance code. Moreover, Johansson proved that $\mathcal{A}$ is an
MRD-code.

Theorem 2.3 ([5], Lemma 3). $\mathcal{A} = \{A_F : F(x) \in P_{l,t,m}\}$ is an MRD-code. That is,
$\mathcal{A}$ is a $(t \times m, q^{ml}, t - l + 1)$ rank distance code.

2.4. Free Modules 

Let $R$ be a ring and $M$ an $R$-module (cf. [4], Ch. IV). A subset $S$ of $M$ is said to be
$R$-linearly independent provided that for distinct $x_1, x_2, \ldots, x_n \in S$ and $r_i \in R$,
$r_1x_1 + r_2x_2 + \cdots + r_nx_n = 0$ implies $r_1 = r_2 = \cdots = r_n = 0$. An $R$-linearly
independent subset of $M$ that spans $M$ is called a basis of $M$, and $M$ is called a
free $R$-module if $M$ has a basis.

If $R$ is a commutative ring with identity and $M$ is a free $R$-module, then
each basis of $M$ has the same cardinality. In this case, the number of elements in
a basis of $M$ is called the rank of $M$, denoted by $\mathrm{rank}\, M$.

Proposition 2.4 ([4], Ch. IV, Theorem 2.1). Let $R$ be a commutative ring with
identity and $M$ a free $R$-module. If $\mathrm{rank}\, M = r$, then $M \cong \underbrace{R \times R \times \cdots \times R}_{r}$.

In the next sections, we always consider free modules over the congruence
class ring $\mathbb{Z}_m$, which is of course a commutative ring with identity.

3. New Constructions of Constant-Weight Codes

We present our main work in this section. In the first part, we identify each binary
constant-weight code with a family of subsets, then give a propagation rule for
constant-weight codes from this relationship. In the next two parts, we generalize
the rule to linear spaces and free modules, respectively, which will lead to new
constructions of binary constant-weight codes.

3.1. Sets 

Suppose $A = \{a_1, a_2, \ldots, a_n\}$ is a set of cardinality $n$. Denote by $2^{|A|} = \{S : S \subseteq A\}$
the set of all subsets of $A$; then we can define a map $\varphi : \mathbb{F}_2^n \to 2^{|A|}$ as
follows:

\[ \varphi((x_1, x_2, \ldots, x_n)) = \{a_i : x_i = 1\}. \]

It is easy to verify that $\varphi$ is a bijection. For each $(n, M, d; w)$ binary constant-weight
code $C \subseteq \mathbb{F}_2^n$, $\varphi(C) = \{A_1, A_2, \ldots, A_M\} \subseteq 2^{|A|}$ satisfies

\[ |A_i| = w \le n, \quad \text{for } i = 1, 2, \ldots, M; \qquad (3.1) \]

and

\[ |A_i| + |A_j| - 2|A_i \cap A_j| = 2w - 2|A_i \cap A_j| \ge d, \quad \text{for all } 1 \le i \ne j \le M. \qquad (3.2) \]







Therefore, given an $(n, M, d; w)$ binary constant-weight code $C$, we can find a family
of subsets of $A$ satisfying (3.1) and (3.2).

On the other hand, if there is a family of subsets of $A$, denoted by $\{A_1, A_2, \ldots, A_M\}$,
satisfying the above conditions (3.1) and (3.2), we can also construct a family
of binary constant-weight codes.

For each fixed $s$ with $1 \le s \le w$, suppose $B_1, B_2, \ldots, B_{n'}$, with $n' = \binom{n}{s}$, are the subsets of
cardinality $s$ of $A$. Construct a binary constant-weight code $C' = \{c_1, c_2, \ldots, c_M\} \subseteq \mathbb{F}_2^{n'}$
as follows:

\[ c_i = (c_{i1}, c_{i2}, \ldots, c_{in'}), \quad \text{where } c_{ij} = \begin{cases} 1, & B_j \subseteq A_i, \\ 0, & B_j \not\subseteq A_i. \end{cases} \qquad (3.3) \]

It is obvious that $wt(c_i) = \binom{w}{s}$ for all $i$, and for any $1 \le i \ne j \le M$, we have

\[ d(c_i, c_j) = wt(c_i) + wt(c_j) - 2\,wt(c_i * c_j) = 2\binom{w}{s} - 2\binom{|A_i \cap A_j|}{s} \ge 2\binom{w}{s} - 2\binom{(2w-d)/2}{s} \triangleq d' \quad \text{(by (3.2))}. \]

Thus, $C'$ is an $(n', M, d'; w')$ constant-weight code, where $n' = \binom{n}{s}$,
$d' = 2\binom{w}{s} - 2\binom{(2w-d)/2}{s}$, and $w' = \binom{w}{s}$.

Therefore, we can obtain an $(n', M, d'; w')$ binary constant-weight code $C'$
with the above parameters from an $(n, M, d; w)$ binary constant-weight code $C$.
Given a lower bound $N$ on some $A(n, d, w)$, there must exist at least one $(n, N, d; w)$
constant-weight code, and we can construct new constant-weight codes from this
$(n, N, d; w)$ code. In conclusion, we have the following theorem:



Theorem 3.1. Given a lower bound $N$ on an $A(n, d, w)$, then for all $1 \le s \le w$ there
exists an $(n', N, d'; w')$ constant-weight code, where $n' = \binom{n}{s}$, $d' = 2\binom{w}{s} - 2\binom{(2w-d)/2}{s}$,
and $w' = \binom{w}{s}$. Thus, $A(n', d', w') \ge N$.
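The propagation rule can be tried on a toy example. The sketch below starts from a hypothetical $(8, 4, 4; 6)$ code, namely the complements of four disjoint pairs in an 8-point set (an example chosen here for illustration, not taken from the paper), applies the rule with $s = 2$, and measures the parameters of the resulting code.

```python
from itertools import combinations
from math import comb

def propagate(blocks, points, s):
    """Map each block A_i to the indicator vector of the s-subsets it contains."""
    s_subsets = list(combinations(points, s))
    return [
        tuple(1 if set(B).issubset(A) else 0 for B in s_subsets)
        for A in blocks
    ]

def min_distance(code):
    """Minimum Hamming distance over all pairs of codewords."""
    return min(
        sum(a != b for a, b in zip(x, y)) for x, y in combinations(code, 2)
    )

# Illustrative input: supports of an (8, 4, 4; 6) constant-weight code.
points = list(range(8))
blocks = [set(points) - {2 * i, 2 * i + 1} for i in range(4)]
new_code = propagate(blocks, points, 2)
```

Here the new code has length $\binom{8}{2}$, weight $\binom{6}{2}$ and minimum distance $2\binom{6}{2} - 2\binom{4}{2}$, in line with the parameters stated in the theorem.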



3.2. Linear Spaces 

In this part, we introduce a construction of binary constant-weight codes from
linear spaces similar to the one from sets in the previous part.

Let $V$ be a linear space of dimension $r$ over a finite field $\mathbb{F}_q$ and $s, t, r$ positive
integers satisfying $1 \le s \le t \le r$. Put $V_1, V_2, \ldots, V_n$, with $n = {r \brack s}_q$, to be all dimension $s$
subspaces of $V$. Then given some dimension $t$ subspaces $W_1, W_2, \ldots, W_M$ $(M \le {r \brack t}_q)$
of $V$, we can construct a binary constant-weight code $C = \{c_1, c_2, \ldots, c_M\}$ in a
similar way:

\[ c_i = (c_{i1}, c_{i2}, \ldots, c_{in}), \quad \text{where } c_{ij} = \begin{cases} 1, & V_j \subseteq W_i, \\ 0, & V_j \not\subseteq W_i. \end{cases} \qquad (3.4) \]







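The parameters of $C$ can be easily determined. Obviously, $wt(c_i) = {t \brack s}_q$ for
all $i$. Denote $\theta = \max\{\dim(W_i \cap W_j) : 1 \le i \ne j \le M\} \le t - 1$, then for any
$1 \le i \ne j \le M$,

\[ d(c_i, c_j) = wt(c_i) + wt(c_j) - 2\,wt(c_i * c_j) = 2{t \brack s}_q - 2{\dim(W_i \cap W_j) \brack s}_q \ge 2{t \brack s}_q - 2{\theta \brack s}_q. \]

So $C$ is an $\big({r \brack s}_q, M, 2{t \brack s}_q - 2{\theta \brack s}_q; {t \brack s}_q\big)$ constant-weight code. Hitherto, we have
already proven the following theorem: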

Theorem 3.2. Let $V$ be a linear space of dimension $r$ over a finite field $\mathbb{F}_q$. If there
exists a set of subspaces of $V$, denoted by $\Omega(r, M, t, \theta)$, satisfying

(i) $|\Omega| = M$;

(ii) $\dim(W) = t$, $\forall W \in \Omega$;

(iii) $\dim(W \cap W') \le \theta$, $\forall W \ne W' \in \Omega$,

then there exists an $\big({r \brack s}_q, M, 2{t \brack s}_q - 2{\theta \brack s}_q; {t \brack s}_q\big)$ binary constant-weight code for
all $1 \le s \le t$.
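To make the construction concrete, here is a minimal sketch for the special case $q = 2$, $r = 4$, $t = 2$, $s = 1$ and $\theta = 0$. Vectors of $\mathbb{F}_2^4$ are encoded as integers, and the family $\Omega$ is found by a simple greedy search for 2-dimensional subspaces that pairwise meet only in the zero vector; the greedy search is an assumption made for this illustration and is not the method used in the paper.

```python
from itertools import combinations

R = 4                               # V = F_2^4, vectors encoded as integers 0..15
POINTS = list(range(1, 2 ** R))     # each nonzero vector spans one 1-dim subspace

def plane(a, b):
    """Nonzero vectors of the 2-dimensional subspace spanned by a and b."""
    return frozenset({a, b, a ^ b})

def all_planes():
    """All 2-dimensional subspaces of V, in a deterministic order."""
    return sorted({plane(a, b) for a, b in combinations(POINTS, 2)}, key=sorted)

def greedy_omega(planes):
    """Greedily pick planes that pairwise intersect trivially (theta = 0)."""
    omega = []
    for W in planes:
        if all(W.isdisjoint(U) for U in omega):
            omega.append(W)
    return omega

def codeword(W):
    """Indicator of the 1-dimensional subspaces contained in W, as in (3.4)."""
    return tuple(1 if v in W else 0 for v in POINTS)

def min_distance(code):
    return min(
        sum(a != b for a, b in zip(x, y)) for x, y in combinations(code, 2)
    )

omega = greedy_omega(all_planes())
code = [codeword(W) for W in omega]
```

With these choices every codeword has length 15 and weight 3, and any two codewords have disjoint supports, so their distance is 6, matching the parameters given in the theorem for $\theta = 0$.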

Now our problem becomes how to find $\Omega(r, M, t, \theta)$ with $M$ as large as
possible. We partly solve this problem by obtaining $\Omega(r, M, t, \theta)$ from rank distance
codes. The relationship between $\Omega(r, M, t, \theta)$ and rank distance codes has been
established by Theorems 1 and 3 in [9].

Theorem 3.3 ([9]). If there exists a $(t \times m, M, t - \theta)$ rank distance code over $\mathbb{F}_q$,
then there exists an $\Omega(m + t, M, t, \theta)$ over $\mathbb{F}_q$.

We adopt Johansson's way ([5]) to construct rank distance codes with good
parameters. As we have restated in Section 2.3, the code $\mathcal{A} = \{A_F : F(x) \in P_{l,t,m}\}$
in Theorem 2.3 is a $(t \times m, q^{ml}, t - l + 1)$ rank distance code where $l \le t \le m$.
By Theorem 3.3 there must exist an $\Omega(t + m, q^{ml}, t, l - 1)$ and hence a family of
$\big({m+t \brack s}_q, q^{ml}, 2{t \brack s}_q - 2{l-1 \brack s}_q; {t \brack s}_q\big)$ constant-weight codes for all $1 \le s \le t$. So we
get the following proposition:

Proposition 3.4. Let $\mathbb{F}_q$ be a finite field and $1 \le l \le t \le m$ positive integers, then
for all $1 \le s \le t$,

\[ A(n, d, w) \ge q^{ml}, \qquad (3.6) \]

where $n = {m+t \brack s}_q$, $d = 2{t \brack s}_q - 2{l-1 \brack s}_q$, and $w = {t \brack s}_q$.
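The parameters in the proposition are straightforward to evaluate. The following sketch computes $(n, d, w)$ and the lower bound $q^{ml}$ from the Gaussian coefficients; the function names are illustrative.

```python
def gaussian_binomial(r, k, q):
    """Gaussian coefficient [r, k]_q, defined as 0 when k > r."""
    if k > r:
        return 0
    num = den = 1
    for i in range(1, k + 1):
        num *= q ** (r - k + i) - 1
        den *= q ** i - 1
    return num // den

def proposition_3_4(q, m, t, l, s):
    """Return (n, d, w, lower_bound) such that A(n, d, w) >= lower_bound."""
    n = gaussian_binomial(m + t, s, q)
    w = gaussian_binomial(t, s, q)
    d = 2 * w - 2 * gaussian_binomial(l - 1, s, q)
    return n, d, w, q ** (m * l)
```

For instance, `proposition_3_4(2, 2, 2, 1, 1)` evaluates the case $q = 2$, $m = t = 2$, $l = s = 1$.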

We claim that this family of $\Omega(t + m, q^{ml}, t, l - 1)$ is "optimal", in the sense that its members
approach their maximum possible sizes as $q$ goes to infinity. To show
this, we first give an upper bound on the size of $\Omega(r, M, t, \theta)$, then compare the
parameters of $\Omega(t + m, q^{ml}, t, l - 1)$ with the upper bound.







Theorem 3.5 (Upper bound on $|\Omega(r, M, t, \theta)|$). Let $M(r, t, \theta)$ be the maximum
possible size of $\Omega(r, M, t, \theta)$, then

\[ M(r, t, \theta) \le \frac{{r \brack \theta+1}_q}{{t \brack \theta+1}_q}. \qquad (3.7) \]

Proof. Suppose $V$ is a dimension $r$ linear space over $\mathbb{F}_q$ and $\Omega(r, M, t, \theta) = \{W_1, W_2, \ldots, W_M\}$
a set of subspaces of $V$ with $M = M(r, t, \theta)$. Let $U$ be a dimension $\theta + 1$
subspace of $V$. If $U$ is contained in a $W_i$, we assert that $U$ cannot be contained
in any other $W_j$ with $j \ne i$. Otherwise, $\dim(W_i \cap W_j) \ge \dim U = \theta + 1$, which
contradicts the definition of $\Omega(r, M, t, \theta)$.

The number of dimension $\theta + 1$ subspaces of $V$ is ${r \brack \theta+1}_q$, and there are ${t \brack \theta+1}_q$
dimension $\theta + 1$ subspaces contained in each $W_i$, therefore

\[ M(r, t, \theta) \le \frac{{r \brack \theta+1}_q}{{t \brack \theta+1}_q}. \]



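For any fixed $r$ and $k$, as $q \to +\infty$ we have

\[ {r \brack k}_q = \frac{(q^r - 1)(q^{r-1} - 1) \cdots (q^{r-k+1} - 1)}{(q^k - 1)(q^{k-1} - 1) \cdots (q - 1)} \sim q^{(r-k)k}. \qquad (3.8) \]

It follows that

\[ M(r, t, \theta) \le \frac{{r \brack \theta+1}_q}{{t \brack \theta+1}_q} \sim \frac{q^{(r-\theta-1)(\theta+1)}}{q^{(t-\theta-1)(\theta+1)}} = q^{(r-t)(\theta+1)} \qquad (3.9) \]

when $q \to +\infty$. Clearly, the size $q^{ml}$ of the set $\Omega(t + m, q^{ml}, t, l - 1)$ approaches the upper bound
on $M(r, t, \theta)$, with $r = t + m$ and $\theta = l - 1$, as $q$ goes to infinity.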
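The convergence can be observed numerically. The sketch below evaluates the bound (3.7) exactly and forms the ratio between $q^{ml}$ and the bound for $r = t + m$, $\theta = l - 1$; the helper names are illustrative.

```python
from fractions import Fraction

def gaussian_binomial(r, k, q):
    """Gaussian coefficient [r, k]_q, defined as 0 when k > r."""
    if k > r:
        return 0
    num = den = 1
    for i in range(1, k + 1):
        num *= q ** (r - k + i) - 1
        den *= q ** i - 1
    return num // den

def omega_upper_bound(q, r, t, theta):
    """Right-hand side of (3.7) as an exact fraction."""
    return Fraction(
        gaussian_binomial(r, theta + 1, q), gaussian_binomial(t, theta + 1, q)
    )

def size_to_bound_ratio(q, m, t, l):
    """Ratio of q^(ml) to the upper bound on M(t + m, t, l - 1)."""
    return Fraction(q ** (m * l)) / omega_upper_bound(q, t + m, t, l - 1)

# Ratios for m = t = 2, l = 1 and a few prime powers q.
ratios = [size_to_bound_ratio(q, 2, 2, 1) for q in (2, 3, 4, 5, 7, 8, 9)]
```

For $m = t = 2$ and $l = 1$ the ratio equals $q^2/(q^2+1)$, so it increases towards 1 as $q$ grows.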

Since $\Omega(t + m, q^{ml}, t, l - 1)$ is "optimal", the corresponding constant-weight
codes obtained from our construction should have good parameters, too. To show
this, we compare the lower bound on $A(n, d, w)$ in Proposition 3.4 with an upper
bound introduced in [1].



Proposition 3.6 ([1], Theorem 12). Let $u = w - d/2 + 1$. Then

\[ A(n, d, w) \le \frac{\binom{n}{u}}{\binom{w}{u}}. \qquad (3.10) \]
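A short sketch of the bound (3.10), assuming $d$ is even (the distance between two binary words of equal weight is always even). The function name is illustrative.

```python
from fractions import Fraction
from math import comb

def upper_bound_3_10(n, d, w):
    """Evaluate C(n, u) / C(w, u) with u = w - d/2 + 1, for even d."""
    u = w - d // 2 + 1
    return Fraction(comb(n, u), comb(w, u))
```

For example, for $(n, d, w) = (15, 6, 3)$, the parameters that Proposition 3.4 gives for $q = 2$, $m = t = 2$, $l = s = 1$, the bound evaluates to 5, against the lower bound $q^{ml} = 4$.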







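Still using the notations in Propositions 3.4 and 3.6, we get that when $q \to +\infty$,

\[ n = {m+t \brack s}_q \sim q^{(m+t-s)s}, \qquad (3.11) \]

\[ w = {t \brack s}_q \sim q^{(t-s)s}, \qquad (3.12) \]

\[ u = w - d/2 + 1 = {l-1 \brack s}_q + 1. \qquad (3.13) \]

By Stirling's formula $n! \approx \sqrt{2\pi n}\,(n/e)^n$ $(n \to +\infty)$, we have

\[ A(n, d, w) \le \frac{\binom{n}{u}}{\binom{w}{u}} = \frac{n!\,(w-u)!}{w!\,(n-u)!} \approx \left(\frac{n}{w}\right)^{u} \left(\frac{n}{n-u}\right)^{n-u+\frac{1}{2}} \left(\frac{w}{w-u}\right)^{-(w-u+\frac{1}{2})} \sim q^{msu} \qquad (3.14) \]

when $q$ goes to infinity.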

When $q$ is large, if we choose $s$ that is close to $l$ (in particular, we can take $s = l$),
then $q^{ml}$ can be very close to the upper bound $q^{msu}$. Therefore, our construction
can produce binary constant-weight codes with good parameters in such cases. In
the next section, we will give some examples to illustrate the significance of these
codes.



3.3. Free $\mathbb{Z}_m$-Module

In this part, we give a construction of binary constant-weight codes from free
$\mathbb{Z}_m$-modules. The reason why we choose $\mathbb{Z}_m$ as the coefficient ring is that the
number of free submodules of a free $\mathbb{Z}_m$-module of finite rank is finite and easy to
calculate, thus the parameters of binary constant-weight codes constructed from
free $\mathbb{Z}_m$-modules can be easily determined.

Let $m > 1$ be a positive integer, and $\mathbb{Z}_m$ be the congruence class ring mod
$m$. Then $\mathbb{Z}_m^r = \underbrace{\mathbb{Z}_m \times \mathbb{Z}_m \times \cdots \times \mathbb{Z}_m}_{r}$ is a free module over $\mathbb{Z}_m$ of rank $r$. Denote the
number of rank $k$ $(\le r)$ free submodules of $\mathbb{Z}_m^r$ by $F_{m,r}(k)$. Suppose $1 \le s \le t \le r$
and $\{A_1, A_2, \ldots, A_n\}$ is the set of all rank $s$ free submodules of $\mathbb{Z}_m^r$, where $n = F_{m,r}(s)$.
Similarly, given some rank $t$ free submodules of $\mathbb{Z}_m^r$: $B_1, B_2, \ldots, B_M$ $(M \le F_{m,r}(t))$
such that $\mathrm{rank}(B_i \cap B_j) \le \theta \le t - 1$ for any $1 \le i \ne j \le M$, we can
construct a binary constant-weight code $C = \{c_1, c_2, \ldots, c_M\}$ as follows:

\[ c_i = (c_{i1}, c_{i2}, \ldots, c_{in}), \quad \text{where } c_{ij} = \begin{cases} 1, & A_j \subseteq B_i, \\ 0, & A_j \not\subseteq B_i. \end{cases} \qquad (3.15) \]

It is easy to verify that $C$ is an $(n, M, 2F_{m,t}(s) - 2F_{m,\theta}(s); F_{m,t}(s))$ binary constant-weight
code. To determine the parameters of $C$, we just need to calculate $F_{m,r}(k)$.

In the following, we are going to get a formula to calculate $F_{m,r}(k)$ in several
steps. Firstly, we have a useful lemma.

Lemma 3.7. Write $m$ in the form $m = p_1^{e_1} p_2^{e_2} \cdots p_t^{e_t}$, where the $p_i$'s are different primes
and $e_i > 0$ for all $i$. Then

\[ F_{m,r}(k) = F_{p_1^{e_1},r}(k)\, F_{p_2^{e_2},r}(k) \cdots F_{p_t^{e_t},r}(k). \qquad (3.16) \]

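Proof. By the Chinese Remainder Theorem, $\mathbb{Z}_m \cong \mathbb{Z}_{p_1^{e_1}} \oplus \mathbb{Z}_{p_2^{e_2}} \oplus \cdots \oplus \mathbb{Z}_{p_t^{e_t}}$. Put
$R = \mathbb{Z}_{p_1^{e_1}} \oplus \mathbb{Z}_{p_2^{e_2}} \oplus \cdots \oplus \mathbb{Z}_{p_t^{e_t}}$, then $R^r = \underbrace{R \times R \times \cdots \times R}_{r} \cong \mathbb{Z}_m^r$ and $F_{m,r}(k)$ equals
the number of rank $k$ free $R$-submodules of $R^r$.

For any $x = (x_1, x_2, \ldots, x_r)^T \in R^r$, we write $x$ in the form

\[ x = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1t} \\ x_{21} & x_{22} & \cdots & x_{2t} \\ \vdots & \vdots & & \vdots \\ x_{r1} & x_{r2} & \cdots & x_{rt} \end{pmatrix}, \]

where $x_j = (x_{j1}, x_{j2}, \ldots, x_{jt}) \in R$ is the $j$th row of $x$ for all $j = 1, 2, \ldots, r$. Let
$\mathrm{col}(x)_i = (x_{1i}, x_{2i}, \ldots, x_{ri})^T$ denote the $i$th column of $x$ for all $i = 1, 2, \ldots, t$. Then
$\mathrm{col}(x)_i \in \mathbb{Z}_{p_i^{e_i}}^r$, and for any $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_t) \in R$,

\[ \lambda x = (\lambda_1 \mathrm{col}(x)_1, \lambda_2 \mathrm{col}(x)_2, \ldots, \lambda_t \mathrm{col}(x)_t). \]

Suppose $A$ is a rank $k \ge 1$ free $R$-submodule of $R^r$. Put $A_i \subseteq \mathbb{Z}_{p_i^{e_i}}^r$ to be the
set of all the $i$th columns of elements in $A$, i.e.,

\[ A_i = \{\mathrm{col}(x)_i \in \mathbb{Z}_{p_i^{e_i}}^r : x = (x_1, x_2, \ldots, x_r)^T \in A \subseteq R^r\}, \quad i = 1, 2, \ldots, t. \qquad (3.17) \]

We assert that $A_i$ is a rank $k$ free $\mathbb{Z}_{p_i^{e_i}}$-submodule of $\mathbb{Z}_{p_i^{e_i}}^r$. In fact, take an $R$-basis
$\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}\}$ of $A$; it is easy to check that $\{\mathrm{col}(x^{(1)})_i, \mathrm{col}(x^{(2)})_i, \ldots, \mathrm{col}(x^{(k)})_i\}$
is a $\mathbb{Z}_{p_i^{e_i}}$-basis of $A_i$.

Put $\mathcal{M} = \{A \subseteq R^r : \mathrm{rank}\, A = k\}$ and $\mathcal{N} = \{(B_1, B_2, \ldots, B_t) : B_i \subseteq \mathbb{Z}_{p_i^{e_i}}^r,\ \mathrm{rank}\, B_i = k\}$,
where all modules are free, then we can define a map

\[ \varphi : \mathcal{M} \to \mathcal{N}, \quad A \mapsto (A_1, A_2, \ldots, A_t). \]

It is sufficient to prove that $\varphi$ is bijective.

We show that $\varphi$ is injective at first. Suppose there exist $A, B \in \mathcal{M}$ satisfying
$\varphi(A) = \varphi(B) = (A_1, A_2, \ldots, A_t)$. Take an $R$-basis $\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}\}$ of $A$ and
$\{y^{(1)}, y^{(2)}, \ldots, y^{(k)}\}$ of $B$. Then $\{\mathrm{col}(x^{(n)})_i : n = 1, 2, \ldots, k\}$ and $\{\mathrm{col}(y^{(n)})_i : n = 1, 2, \ldots, k\}$
are two $\mathbb{Z}_{p_i^{e_i}}$-bases of $A_i$. So there exist some $\lambda_{i1}, \lambda_{i2}, \ldots, \lambda_{ik} \in \mathbb{Z}_{p_i^{e_i}}$
for each $i = 1, 2, \ldots, t$ such that

\[ \mathrm{col}(y^{(1)})_i = \sum_{n=1}^{k} \lambda_{in}\, \mathrm{col}(x^{(n)})_i. \]

Thus

\[ y^{(1)} = (\mathrm{col}(y^{(1)})_1, \mathrm{col}(y^{(1)})_2, \ldots, \mathrm{col}(y^{(1)})_t) = \sum_{n=1}^{k} \lambda_n x^{(n)} \in A, \]

where $\lambda_n = (\lambda_{1n}, \lambda_{2n}, \ldots, \lambda_{tn}) \in R$ $(n = 1, 2, \ldots, k)$. Similarly, we can prove
$y^{(2)}, \ldots, y^{(k)} \in A$, which implies $B \subseteq A$. In the same way, we can get $A \subseteq B$. So
$A = B$, i.e., $\varphi$ is injective.

Conversely, for each $(A_1, A_2, \ldots, A_t) \in \mathcal{N}$, we are going to find an $A \in \mathcal{M}$
such that $\varphi(A) = (A_1, A_2, \ldots, A_t)$. Assume $\{a_{i1}, a_{i2}, \ldots, a_{ik}\}$ is a $\mathbb{Z}_{p_i^{e_i}}$-basis of
$A_i$. For each $n = 1, 2, \ldots, k$, take $a_{in} \in \mathbb{Z}_{p_i^{e_i}}^r$ as the $i$th coordinate of $x^{(n)}$ to form
$x^{(n)} = (a_{1n}, a_{2n}, \ldots, a_{tn}) \in R^r$. We assert that $\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}\}$ is $R$-linearly
independent. If there exists $\{\lambda_i = (\lambda_{i1}, \lambda_{i2}, \ldots, \lambda_{it}) \in R : i = 1, 2, \ldots, k\}$ such that

\[ \lambda_1 x^{(1)} + \lambda_2 x^{(2)} + \cdots + \lambda_k x^{(k)} = 0, \]

then

\[ \lambda_{1i} a_{i1} + \lambda_{2i} a_{i2} + \cdots + \lambda_{ki} a_{ik} = 0 \in \mathbb{Z}_{p_i^{e_i}}^r, \quad 1 \le i \le t. \]

Since $\{a_{i1}, a_{i2}, \ldots, a_{ik}\}$ is $\mathbb{Z}_{p_i^{e_i}}$-linearly independent, all the $\lambda_{1i}, \lambda_{2i}, \ldots, \lambda_{ki}$ must
be 0. Hence $\lambda_1 = \lambda_2 = \cdots = \lambda_k = 0$, that is, $\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}\}$ is $R$-linearly
independent. Let $A = \langle x^{(1)}, x^{(2)}, \ldots, x^{(k)} \rangle$ be the $R$-module spanned by $\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}\}$,
then $A$ is a rank $k$ free $R$-submodule of $R^r$ and $\varphi(A) = (A_1, A_2, \ldots, A_t)$.
The proof is finished. □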

By Lemma 3.7 we just need to calculate $F_{p^e,r}(k)$, where $p$ is a prime and
$e \ge 1$. Ahead of the calculation, we still need some lemmas.

Lemma 3.8. Assume $p$ is a prime, $e \ge 1$ an integer, and $m = p^e$. Suppose
$\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}\} \subseteq \mathbb{Z}_m^r$ is $\mathbb{Z}_m$-linearly independent, then it can be extended
to a $\mathbb{Z}_m$-basis $\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}, x^{(k+1)}, \ldots, x^{(r)}\}$ of $\mathbb{Z}_m^r$.

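Proof. We use induction on $k$.

(1) Assume $k = 1$.

Suppose $x^{(1)} = (x_1^{(1)}, x_2^{(1)}, \ldots, x_r^{(1)})^T$. Since $\{x^{(1)}\}$ is $\mathbb{Z}_m$-linearly independent,
$(\lambda x_1^{(1)}, \lambda x_2^{(1)}, \ldots, \lambda x_r^{(1)}) \ne 0$ for all $\lambda \ne 0 \in \mathbb{Z}_m$. So $\mathrm{g.c.d.}(x_1^{(1)}, x_2^{(1)}, \ldots, x_r^{(1)})$
cannot be divided by $p$. Without loss of generality, we may assume $p \nmid x_1^{(1)}$, i.e.,
$x_1^{(1)}$ is invertible in $\mathbb{Z}_m$.

Take the standard $\mathbb{Z}_m$-basis $\{e_i = (0, \ldots, 0, 1, 0, \ldots, 0)^T : i = 1, 2, \ldots, r\}$ of
$\mathbb{Z}_m^r$, where $e_i$ has only one nonzero coordinate 1 at the $i$th position. Obviously,

\[ x^{(1)} = x_1^{(1)} e_1 + x_2^{(1)} e_2 + \cdots + x_r^{(1)} e_r, \]

so

\[ e_1 = -(x_1^{(1)})^{-1} \big( x_2^{(1)} e_2 + \cdots + x_r^{(1)} e_r - x^{(1)} \big). \]

Thus $\{e_1, e_2, \ldots, e_r\}$ can be $\mathbb{Z}_m$-linearly represented by $\{x^{(1)}, e_2, \ldots, e_r\}$, and
$\{x^{(1)}, e_2, \ldots, e_r\}$ is a $\mathbb{Z}_m$-basis of $\mathbb{Z}_m^r$ as well.

(2) Suppose this lemma holds for $k - 1$.

Since $\{x^{(1)}, x^{(2)}, \ldots, x^{(k-1)}\}$ is also $\mathbb{Z}_m$-linearly independent, by our assumption
it can be extended to a $\mathbb{Z}_m$-basis $\{x^{(1)}, x^{(2)}, \ldots, x^{(k-1)}, y^{(k)}, \ldots, y^{(r)}\}$ of $\mathbb{Z}_m^r$.
So we can write $x^{(k)}$ in the form

\[ x^{(k)} = \lambda_1 x^{(1)} + \lambda_2 x^{(2)} + \cdots + \lambda_{k-1} x^{(k-1)} + \lambda_k y^{(k)} + \cdots + \lambda_r y^{(r)}. \]

We assert that there is at least one $\lambda_n$ in $\{\lambda_k, \lambda_{k+1}, \ldots, \lambda_r\}$ that is invertible in
$\mathbb{Z}_m$. Otherwise, $p \mid \mathrm{g.c.d.}(\lambda_k, \lambda_{k+1}, \ldots, \lambda_r)$, and we have

\[ p^{e-1} x^{(k)} - \sum_{n=1}^{k-1} p^{e-1} \lambda_n x^{(n)} = p^{e-1} \sum_{n=k}^{r} \lambda_n y^{(n)} = 0, \]

which contradicts the assumption that $\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}\}$ is $\mathbb{Z}_m$-linearly
independent.

We may assume $p \nmid \lambda_k$, then

\[ y^{(k)} = -\lambda_k^{-1} \Big( \sum_{n=1}^{k-1} \lambda_n x^{(n)} + \sum_{n=k+1}^{r} \lambda_n y^{(n)} - x^{(k)} \Big). \]

Thus $\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}\}$ can be extended to a $\mathbb{Z}_m$-basis $\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}, y^{(k+1)}, \ldots, y^{(r)}\}$
of $\mathbb{Z}_m^r$.

The result follows. □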



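Corollary 3.9. Under the same conditions as in Lemma 3.8, $A = \langle x^{(1)}, x^{(2)}, \ldots, x^{(k)} \rangle$ is a
free $\mathbb{Z}_m$-module of rank $k$, and $\mathbb{Z}_m^r/A$ is a free $\mathbb{Z}_m$-module of rank $r - k$.

Proof. $A = \bigoplus_{n=1}^{k} \mathbb{Z}_m x^{(n)} \cong \mathbb{Z}_m^k$ is of course a free $\mathbb{Z}_m$-module of rank $k$. By
Lemma 3.8, $\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}\}$ can be extended to a $\mathbb{Z}_m$-basis $\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}, x^{(k+1)}, \ldots, x^{(r)}\}$
of $\mathbb{Z}_m^r$. Therefore

\[ \mathbb{Z}_m^r / A \cong \bigoplus_{n=k+1}^{r} \mathbb{Z}_m x^{(n)} \cong \mathbb{Z}_m^{r-k}. \]

□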







Now we can begin to calculate $F_{p^e,r}(k)$.

Lemma 3.10. Assume $p$ is a prime, $e \ge 1$ and $m = p^e$. Then

\[ F_{m,r}(k) = p^{k(r-k)(e-1)} {r \brack k}_p, \quad 1 \le k \le r. \qquad (3.18) \]



Proof. Let $N_{m,r}(k)$ denote the number of $\mathbb{Z}_m$-linearly independent sets of
cardinality $k$ in $\mathbb{Z}_m^r$, then $F_{m,r}(k) = N_{m,r}(k)/N_{m,k}(k)$. We again use induction
on $k$.

(1) Assume $k = 1$.

An element $x = (x_1, x_2, \ldots, x_r)^T \in \mathbb{Z}_m^r$ can be taken as a $\mathbb{Z}_m$-basis to span
a rank 1 free $\mathbb{Z}_m$-module if and only if $\lambda x \ne 0$ for all $\lambda \ne 0 \in \mathbb{Z}_m$, that is,
$p \nmid \mathrm{g.c.d.}(x_1, x_2, \ldots, x_r)$. So there are $p^{re} - p^{r(e-1)}$ elements in $\mathbb{Z}_m^r$ that can span a
rank 1 free $\mathbb{Z}_m$-module.

On the other hand, in any rank 1 free $\mathbb{Z}_m$-module, there are $\phi(m) = p^e - p^{e-1}$
elements that can be taken as a $\mathbb{Z}_m$-basis. Thus,

\[ F_{m,r}(1) = \frac{p^{re} - p^{r(e-1)}}{p^e - p^{e-1}} = \frac{p^{(r-1)(e-1)}(p^r - 1)}{p - 1} = p^{(r-1)(e-1)} {r \brack 1}_p. \]

(2) Suppose this lemma holds for $k - 1$.

For any $\mathbb{Z}_m$-linearly independent set $\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}\}$ of cardinality $k$, put
$A = \langle x^{(1)}, x^{(2)}, \ldots, x^{(k-1)} \rangle$. By Corollary 3.9, $A$ is a free $\mathbb{Z}_m$-module of rank $k - 1$,
and $\{x^{(k)}\}$ is $\mathbb{Z}_m$-linearly independent in $\mathbb{Z}_m^r/A \cong \mathbb{Z}_m^{r-k+1}$.

On the other hand, to choose a $\mathbb{Z}_m$-linearly independent set $\{x^{(1)}, x^{(2)}, \ldots, x^{(k-1)}\}$
in $\mathbb{Z}_m^r$, we have $N_{m,r}(k-1)$ choices. To make $\{x^{(1)}, x^{(2)}, \ldots, x^{(k)}\}$ $\mathbb{Z}_m$-linearly
independent, $\{x^{(k)}\}$ must be $\mathbb{Z}_m$-linearly independent in $\mathbb{Z}_m^r/A$. Thus,
$x^{(k)}$ has $N_{m,r-k+1}(1)$ choices in $\mathbb{Z}_m^r/A$. A coset $x^{(k)}$ has $m^{k-1}$ elements in $\mathbb{Z}_m^r$, so
$x^{(k)}$ has $m^{k-1} N_{m,r-k+1}(1)$ choices in $\mathbb{Z}_m^r$. Then

\[ N_{m,r}(k) = m^{k-1} N_{m,r-k+1}(1)\, N_{m,r}(k-1)/k. \]

Therefore,

\[ F_{m,r}(k) = \frac{N_{m,r}(k)}{N_{m,k}(k)} = \frac{m^{k-1} N_{m,r-k+1}(1)\, N_{m,r}(k-1)}{m^{k-1} N_{m,1}(1)\, N_{m,k}(k-1)} = \frac{F_{m,r-k+1}(1)\, F_{m,r}(k-1)}{F_{m,k}(k-1)} \]

\[ = p^{k(r-k)(e-1)} \frac{\prod_{n=r-k+1}^{r} (p^n - 1)}{\prod_{n=1}^{k} (p^n - 1)} = p^{k(r-k)(e-1)} {r \brack k}_p, \]

as we needed. □

Finally, we get a formula to calculate $F_{m,r}(k)$.

Theorem 3.11. Let $m$ be a positive integer. Write $m$ in the form $m = p_1^{e_1} p_2^{e_2} \cdots p_t^{e_t}$,
where the $p_i$'s are different primes and $e_i > 0$ for all $i$. Then

\[ F_{m,r}(k) = \prod_{i=1}^{t} p_i^{k(r-k)(e_i-1)} {r \brack k}_{p_i}, \quad 1 \le k \le r. \]

Proof. By Lemma 3.10 and Lemma 3.7. □
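The formula of Theorem 3.11 is easy to evaluate, and for rank 1 it can be compared with a direct enumeration of the cyclic submodules of $\mathbb{Z}_m^r$ generated by elements $x$ with $\lambda x \ne 0$ for all $\lambda \ne 0$. The sketch below does both; the function names and the trial-division factorization are illustrative choices, not part of the paper.

```python
from itertools import product

def gaussian_binomial(r, k, q):
    """Gaussian coefficient [r, k]_q, defined as 0 when k > r."""
    if k > r:
        return 0
    num = den = 1
    for i in range(1, k + 1):
        num *= q ** (r - k + i) - 1
        den *= q ** i - 1
    return num // den

def factorize(m):
    """Prime factorization of m as {prime: exponent}, by trial division."""
    factors = {}
    p = 2
    while m >= p * p:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors

def free_submodule_count(m, r, k):
    """F_{m,r}(k) according to the product formula of Theorem 3.11."""
    total = 1
    for p, e in factorize(m).items():
        total *= p ** (k * (r - k) * (e - 1)) * gaussian_binomial(r, k, p)
    return total

def brute_force_rank1(m, r):
    """Count rank 1 free submodules of Z_m^r by direct enumeration."""
    modules = set()
    for x in product(range(m), repeat=r):
        # x spans a free module iff lam * x != 0 for every nonzero lam
        if all(any((lam * xi) % m for xi in x) for lam in range(1, m)):
            modules.add(frozenset(
                tuple((lam * xi) % m for xi in x) for lam in range(m)
            ))
    return len(modules)
```

For example, `free_submodule_count(6, 4, 1)` evaluates $F_{6,4}(1)$, and `brute_force_rank1(6, 2)` can be compared with `free_submodule_count(6, 2, 1)`.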

Hitherto, we have got a formula to calculate $F_{m,r}(k)$, and hence determined
the parameters of $C$. But unfortunately, for fixed $\theta$ smaller than $t - 1$, we have not found a
way to obtain some rank $t$ free submodules $B_1, B_2, \ldots, B_M$ of $\mathbb{Z}_m^r$ such that $M$
is as large as possible and $\mathrm{rank}(B_i \cap B_j) \le \theta$ for any $1 \le i \ne j \le M$. Here is a
particular result for $\theta = t - 1$:

Theorem 3.12. Let $m$ be a positive integer. Then for all $1 \le s \le t \le r$,

\[ A(F_{m,r}(s),\ 2F_{m,t}(s) - 2F_{m,t-1}(s),\ F_{m,t}(s)) \ge F_{m,r}(t). \qquad (3.20) \]

4. Examples 

In this section, we give some explicit examples from our constructions in Section 3.
These examples improve some lower bounds [7] on binary constant-weight codes,
and hence show the significance of our constructions.

Example. According to Theorem 3.1 with $s = 2$, $A(n, d, w) \ge N$ implies
$A\big(\binom{n}{2}, 2\binom{w}{2} - 2\binom{w - d/2}{2}, \binom{w}{2}\big) \ge N$. We choose some lower bounds on $A(n, d, w)$ from [7], and give
some deduced lower bounds in Table 1.

Note that all of these new lower bounds in Table 1 can improve known lower
bounds in [7].
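The parameter map of Theorem 3.1 used for Table 1 can be written as a one-line function. The sketch below, with an illustrative function name, maps $(n, d, w)$ to $(n', d', w')$ for a given $s$ and reproduces the parameter triples listed in Table 1 for $s = 2$.

```python
from math import comb

def deduced_parameters(n, d, w, s):
    """Parameters (n', d', w') given by Theorem 3.1, for even d."""
    return comb(n, s), 2 * comb(w, s) - 2 * comb(w - d // 2, s), comb(w, s)

# Rows of Table 1: input parameters (n, d, w) taken with s = 2.
table_1 = [deduced_parameters(n, 4, 6, 2) for n in (8, 9, 10, 11, 12)]
```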





Table 1

Lower bounds from [7]        Deduced lower bounds
$A(8, 4, 6) \ge 4$           $A(28, 18, 15) \ge 4$
$A(9, 4, 6) \ge 12$          $A(36, 18, 15) \ge 12$
$A(10, 4, 6) \ge 30$         $A(45, 18, 15) \ge 30$
$A(11, 4, 6) \ge 66$         $A(55, 18, 15) \ge 66$
$A(12, 4, 6) \ge 132$        $A(66, 18, 15) \ge 132$

Table 2

$q$     $m = t = 2$, $l = s = 1$
8       $64 \le A(512, 16, 8) \le 65$
11      $121 \le A(1331, 22, 11) \le 122$
13      $169 \le A(2197, 26, 13) \le 170$
17      $289 \le A(4913, 34, 17) \le 290$
19      $361 \le A(6859, 38, 19) \le 362$

Table 3

$q$     $m = t = l = 2$, $s = 1$
8       $4096 \le A(512, 14, 8) \le 4745$
11      $14641 \le A(1331, 20, 11) \le 16226$
13      $28561 \le A(2197, 24, 13) \le 31110$
17      $83521 \le A(4913, 32, 17) \le 89030$
19      $130321 \le A(6859, 36, 19) \le 137922$



Example. Let $1 \le l \le t \le m$ be positive integers. Taking specific $s$ that is close to
$l$, we calculate the lower and upper bounds on some $A(n, d, w)$, and list them in
Table 2 and Table 3.

The lower and upper bounds in Table 2 and Table 3 are very close, which
illustrates that our construction in Section 3.2 can produce constant-weight
codes with good parameters.

Example. Let $\mathbb{Z}_m = \mathbb{Z}_6$. Assume $r = 4$, $t = 3$, and $s = 1$; we get $F_{6,4}(1) = F_{6,4}(3) = 600$,
$F_{6,3}(1) = 91$ and $F_{6,2}(1) = 12$, so there exists a $(600, 600, 156; 91)$ constant-weight
code by our construction from free $\mathbb{Z}_m$-modules, i.e., $A(600, 156, 91) \ge 600$.










Acknowledgment 

The authors would like to thank Prof. Chaoping Xing for helpful comments and 
corrections. 

References 

[1] E. Agrell, A. Vardy and K. Zeger, "Upper Bounds for Constant-Weight Codes,"
IEEE Trans. Inform. Theory, vol. 46, no. 7, Nov. 2000.

[2] J.H. Conway and N.J.A. Sloane, Sphere Packings, Lattices and Groups, 3rd ed. New
York: Springer, 1999.

[3] E.M. Gabidulin, "Theory of Codes with Maximum Rank Distance," Problems of
Information Transmission, 21(1), 1985.

[4] T.W. Hungerford, Algebra, Springer-Verlag, 1974.

[5] T. Johansson, "Authentication Codes for Nontrusting Parties Obtained from Rank
Metric Codes," Designs, Codes and Cryptography, vol. 6, pp. 205-218, 1995.

[6] F.J. MacWilliams and N.J.A. Sloane, The Theory of Error-Correcting Codes,
Amsterdam, The Netherlands: North-Holland, 1977.

[7] E.M. Rains, Table of Constant Weight Binary Codes [Online]. Available:
http://www.research.att.com/~njas/codes/Andw/index.html.

[8] S. Roman, Coding and Information Theory, Springer-Verlag, 1992.

[9] R. Safavi-Naini, H. Wang and C. Xing, "Linear Authentication Codes: Bounds and
Constructions," Indocrypt'01, Lecture Notes in Computer Science, vol. 2247, 2001,
pp. 127-135.



Lei Li and Shoulun Long 

Department of Mathematics 

University of Science and Technology of China 

Hefei, Anhui 230026 

P.R. China 

e-mail: harrylee@mail.ustc.edu.cn 
e-mail: lsl@mail.ustc.edu.cn 




Progress in Computer Science and Applied Logic, Vol. 23, 223-226 
© 2004 Birkhäuser Verlag Basel/Switzerland 



Good Self-Dual Quasi-Cyclic Codes 
over F_q, q Odd 

San Ling and Patrick Solé 



Abstract. We show that there are long self-dual q-ary quasi-cyclic codes above 
the Gilbert-Varshamov bound for odd q. We use Hughes’s (u + v | u - v) 
construction. 

Mathematics Subject Classification (2000). Primary 94B15. 

Keywords. Self-dual codes, quasi-cyclic codes, Gilbert-Varshamov bound, 
(u + v | u - v) construction. 



1. Introduction 

It has been known for more than a quarter of a century that long self-dual q-ary 
codes exist [8]. 

Recently, Hughes introduced the (u + v | u - v) construction [2, 3] for codes 
over fields of odd characteristic. In [5] the present authors show that all q-ary 
quasi-cyclic codes of index half the length over a field of odd characteristic can 
be obtained in that way. In a companion paper [6], building on the results of [1], 
the authors study asymptotically good self-dual binary quasi-cyclic codes. Here, 
following [8], we study the existence of asymptotically good self-dual q-ary 
quasi-cyclic codes, where q is an odd prime power. 
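
The construction is not restated in this introduction, so the following is only a minimal sketch under the usual reading: from two codes C1 and C2 of the same length, form all words (u + v | u - v) with u in C1 and v in C2. The choice of F_5, the two length-2 generator vectors, and all function names are illustrative assumptions, not taken from the paper. Since the Euclidean inner product of two such words is 2(u·u') + 2(v·v'), self-orthogonality of C1 and C2 carries over to the combined code.

```python
from itertools import product

P = 5  # assumed small odd prime, so F_5 is the integers modulo 5


def span(generators, p=P):
    """Return all F_p-linear combinations of the generator rows."""
    n = len(generators[0])
    return {
        tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % p for i in range(n))
        for coeffs in product(range(p), repeat=len(generators))
    }


def u_plus_v_u_minus_v(c1, c2, p=P):
    """Return the set of words (u + v | u - v) with u in c1 and v in c2."""
    return {
        tuple((a + b) % p for a, b in zip(u, v))
        + tuple((a - b) % p for a, b in zip(u, v))
        for u in c1
        for v in c2
    }


def is_self_orthogonal(code, p=P):
    """Check that every pair of codewords has Euclidean inner product 0 mod p."""
    return all(
        sum(a * b for a, b in zip(x, y)) % p == 0 for x in code for y in code
    )


# Illustrative self-dual codes of length 2 over F_5:
# (1, 2) has 1 + 4 = 5 = 0 and (1, 3) has 1 + 9 = 10 = 0 modulo 5.
c1 = span([(1, 2)])
c2 = span([(1, 3)])
code = u_plus_v_u_minus_v(c1, c2)

print(len(code), is_self_orthogonal(code))
```

In this toy case the result has 25 = 5^2 codewords of length 4 and is self-orthogonal, hence self-dual; the example illustrates the mechanics only and makes no claim about the quasi-cyclic structure studied in the paper.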

2. Combinatorics 

Throughout this paper, we assume q to be an odd prime power and that all the 
codes over F_q are equipped with the Euclidean inner product. 

The following collection of preliminary results may be found in [8, p. 37]: 

Proposition 2.1. Let ℓ be a positive even integer if q ≡ 1 (mod 4) and let ℓ be a 
positive multiple of 4 if q ≡ 3 (mod 4). 



The research of the first-named author is partially supported by MOE-ARF research grant 
R-146-000-029-112 and DSTA research grant R-394-000-011-422. 




